The following changes since commit 8c1ecb590497b0349c550607db923972b37f6963:

  Merge remote-tracking branch 'remotes/stsquad/tags/pull-testing-next-280519-2' into staging (2019-05-28 17:38:32 +0100)

are available in the Git repository at:

  https://github.com/XanClic/qemu.git tags/pull-block-2019-05-28

for you to fetch changes up to a2d665c1bc3624a8375e2f9a7d569f7565cc1358:

  blockdev: loosen restrictions on drive-backup source node (2019-05-28 20:30:55 +0200)

----------------------------------------------------------------
Block patches:
- qcow2: Use threads for encrypted I/O
- qemu-img rebase: Optimizations
- backup job: Allow any source node, and some refactoring
- Some general simplifications in the block layer

----------------------------------------------------------------
Alberto Garcia (2):
      block: Use bdrv_unref_child() for all children in bdrv_close()
      block: Make bdrv_root_attach_child() unref child_bs on failure

Andrey Shinkevich (1):
      qcow2-bitmap: initialize bitmap directory alignment

Anton Nefedov (1):
      qcow2: skip writing zero buffers to empty COW areas

John Snow (1):
      blockdev: loosen restrictions on drive-backup source node

Sam Eiderman (3):
      qemu-img: rebase: Reuse parent BlockDriverState
      qemu-img: rebase: Reduce reads on in-chain rebase
      qemu-img: rebase: Reuse in-chain BlockDriverState

Vladimir Sementsov-Ogievskiy (13):
      qcow2.h: add missing include
      qcow2: add separate file for threaded data processing functions
      qcow2-threads: use thread_pool_submit_co
      qcow2-threads: qcow2_co_do_compress: protect queuing by mutex
      qcow2-threads: split out generic path
      qcow2: qcow2_co_preadv: improve locking
      qcow2: bdrv_co_pwritev: move encryption code out of the lock
      qcow2: do encryption in threads
      block/backup: simplify backup_incremental_init_copy_bitmap
      block/backup: move to copy_bitmap with granularity
      block/backup: refactor and tolerate unallocated cluster skipping
      block/backup: unify different modes code path
      block/backup: refactor: split out backup_calculate_cluster_size

 block/Makefile.objs | 2 +-
 qapi/block-core.json | 4 +-
 block/qcow2.h | 26 ++-
 block.c | 46 +++---
 block/backup.c | 243 ++++++++++++---------------
 block/block-backend.c | 3 +-
 block/qcow2-bitmap.c | 3 +-
 block/qcow2-cache.c | 1 -
 block/qcow2-cluster.c | 10 +-
 block/qcow2-refcount.c | 1 -
 block/qcow2-snapshot.c | 1 -
 block/qcow2-threads.c | 268 ++++++++++++++++++++++++++++++
 block/qcow2.c | 320 +++++++++++++-----------------------
 block/quorum.c | 1 -
 blockdev.c | 7 +-
 blockjob.c | 2 +-
 qemu-img.c | 85 ++++++----
 tests/test-bdrv-drain.c | 6 -
 tests/test-bdrv-graph-mod.c | 1 -
 block/trace-events | 1 +
 tests/qemu-iotests/056 | 2 +-
 tests/qemu-iotests/060 | 7 +-
 tests/qemu-iotests/060.out | 5 +-
 23 files changed, 615 insertions(+), 430 deletions(-)
 create mode 100644 block/qcow2-threads.c

--
2.21.0

The following changes since commit ac793156f650ae2d77834932d72224175ee69086:

  Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20201020-1' into staging (2020-10-20 21:11:35 +0100)

are available in the Git repository at:

  https://gitlab.com/stefanha/qemu.git tags/block-pull-request

for you to fetch changes up to 32a3fd65e7e3551337fd26bfc0e2f899d70c028c:

  iotests: add commit top->base cases to 274 (2020-10-22 09:55:39 +0100)

----------------------------------------------------------------
Pull request

v2:
 * Fix format string issues on 32-bit hosts [Peter]
 * Fix qemu-nbd.c CONFIG_POSIX ifdef issue [Eric]
 * Fix missing eventfd.h header on macOS [Peter]
 * Drop unreliable vhost-user-blk test (will send a new patch when ready) [Peter]

This pull request contains the vhost-user-blk server by Coiby Xu along with my
additions, block/nvme.c alignment and hardware error statistics by Philippe
Mathieu-Daudé, and bdrv_co_block_status_above() fixes by Vladimir
Sementsov-Ogievskiy.

----------------------------------------------------------------
Coiby Xu (6):
      libvhost-user: Allow vu_message_read to be replaced
      libvhost-user: remove watch for kick_fd when de-initialize vu-dev
      util/vhost-user-server: generic vhost user server
      block: move logical block size check function to a common utility function
      block/export: vhost-user block device backend server
      MAINTAINERS: Add vhost-user block device backend server maintainer

Philippe Mathieu-Daudé (1):
      block/nvme: Add driver statistics for access alignment and hw errors

Stefan Hajnoczi (16):
      util/vhost-user-server: s/fileds/fields/ typo fix
      util/vhost-user-server: drop unnecessary QOM cast
      util/vhost-user-server: drop unnecessary watch deletion
      block/export: consolidate request structs into VuBlockReq
      util/vhost-user-server: drop unused DevicePanicNotifier
      util/vhost-user-server: fix memory leak in vu_message_read()
      util/vhost-user-server: check EOF when reading payload
      util/vhost-user-server: rework vu_client_trip() coroutine lifecycle
      block/export: report flush errors
      block/export: convert vhost-user-blk server to block export API
      util/vhost-user-server: move header to include/
      util/vhost-user-server: use static library in meson.build
      qemu-storage-daemon: avoid compiling blockdev_ss twice
      block: move block exports to libblockdev
      block/export: add iothread and fixed-iothread options
      block/export: add vhost-user-blk multi-queue support

Vladimir Sementsov-Ogievskiy (5):
      block/io: fix bdrv_co_block_status_above
      block/io: bdrv_common_block_status_above: support include_base
      block/io: bdrv_common_block_status_above: support bs == base
      block/io: fix bdrv_is_allocated_above
      iotests: add commit top->base cases to 274

 MAINTAINERS | 9 +
 qapi/block-core.json | 24 +-
 qapi/block-export.json | 36 +-
 block/coroutines.h | 2 +
 block/export/vhost-user-blk-server.h | 19 +
 contrib/libvhost-user/libvhost-user.h | 21 +
 include/qemu/vhost-user-server.h | 65 +++
 util/block-helpers.h | 19 +
 block/export/export.c | 37 +-
 block/export/vhost-user-blk-server.c | 431 ++++++++++++++++++++
 block/io.c | 132 +++---
 block/nvme.c | 27 ++
 block/qcow2.c | 16 +-
 contrib/libvhost-user/libvhost-user-glib.c | 2 +-
 contrib/libvhost-user/libvhost-user.c | 15 +-
 hw/core/qdev-properties-system.c | 31 +-
 nbd/server.c | 2 -
 qemu-nbd.c | 21 +-
 softmmu/vl.c | 4 +
 stubs/blk-exp-close-all.c | 7 +
 tests/vhost-user-bridge.c | 2 +
 tools/virtiofsd/fuse_virtio.c | 4 +-
 util/block-helpers.c | 46 +++
 util/vhost-user-server.c | 446 +++++++++++++++++++++
 block/export/meson.build | 3 +-
 contrib/libvhost-user/meson.build | 1 +
 meson.build | 22 +-
 nbd/meson.build | 2 +
 storage-daemon/meson.build | 3 +-
 stubs/meson.build | 1 +
 tests/qemu-iotests/274 | 20 +
 tests/qemu-iotests/274.out | 68 ++++
 util/meson.build | 4 +
 33 files changed, 1420 insertions(+), 122 deletions(-)
 create mode 100644 block/export/vhost-user-blk-server.h
 create mode 100644 include/qemu/vhost-user-server.h
 create mode 100644 util/block-helpers.h
 create mode 100644 block/export/vhost-user-blk-server.c
 create mode 100644 stubs/blk-exp-close-all.c
 create mode 100644 util/block-helpers.c
 create mode 100644 util/vhost-user-server.c

--
2.26.2

1
From: Anton Nefedov <anton.nefedov@virtuozzo.com>
1
From: Philippe Mathieu-Daudé <philmd@redhat.com>
2
2
3
If COW areas of the newly allocated clusters are zeroes on the backing
3
Keep statistics of some hardware errors, and number of
4
image, efficient bdrv_write_zeroes(flags=BDRV_REQ_NO_FALLBACK) can be
4
aligned/unaligned I/O accesses.
5
used on the whole cluster instead of writing explicit zero buffers later
6
in perform_cow().
7
5
8
iotest 060:
6
QMP example booting a full RHEL 8.3 aarch64 guest:
9
write to the discarded cluster does not trigger COW anymore.
10
Use a backing image instead.
11
7
12
Signed-off-by: Anton Nefedov <anton.nefedov@virtuozzo.com>
8
{ "execute": "query-blockstats" }
13
Message-id: 20190516142749.81019-2-anton.nefedov@virtuozzo.com
9
{
14
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
10
"return": [
15
Reviewed-by: Alberto Garcia <berto@igalia.com>
11
{
16
Signed-off-by: Max Reitz <mreitz@redhat.com>
12
"device": "",
13
"node-name": "drive0",
14
"stats": {
15
"flush_total_time_ns": 6026948,
16
"wr_highest_offset": 3383991230464,
17
"wr_total_time_ns": 807450995,
18
"failed_wr_operations": 0,
19
"failed_rd_operations": 0,
20
"wr_merged": 3,
21
"wr_bytes": 50133504,
22
"failed_unmap_operations": 0,
23
"failed_flush_operations": 0,
24
"account_invalid": false,
25
"rd_total_time_ns": 1846979900,
26
"flush_operations": 130,
27
"wr_operations": 659,
28
"rd_merged": 1192,
29
"rd_bytes": 218244096,
30
"account_failed": false,
31
"idle_time_ns": 2678641497,
32
"rd_operations": 7406,
33
},
34
"driver-specific": {
35
"driver": "nvme",
36
"completion-errors": 0,
37
"unaligned-accesses": 2959,
38
"aligned-accesses": 4477
39
},
40
"qdev": "/machine/peripheral-anon/device[0]/virtio-backend"
41
}
42
]
43
}
44
45
Suggested-by: Stefan Hajnoczi <stefanha@gmail.com>
46
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
47
Acked-by: Markus Armbruster <armbru@redhat.com>
48
Message-id: 20201001162939.1567915-1-philmd@redhat.com
49
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
17
---
50
---
18
qapi/block-core.json | 4 +-
51
qapi/block-core.json | 24 +++++++++++++++++++++++-
19
block/qcow2.h | 6 +++
52
block/nvme.c | 27 +++++++++++++++++++++++++++
20
block/qcow2-cluster.c | 2 +-
53
2 files changed, 50 insertions(+), 1 deletion(-)
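
For readers unfamiliar with driver-specific stats, here is a minimal, hypothetical sketch of the pattern this patch follows. The "foo" driver, its state struct and its QAPI type are invented names used only for illustration; only the nvme variant in the diff below is actually added by this patch.

    /* Hypothetical driver "foo": BDRVFooState, BlockStatsSpecificFoo and
     * BLOCKDEV_DRIVER_FOO are invented, shown only to illustrate the shape
     * of a bdrv_get_specific_stats() implementation. */
    static BlockStatsSpecific *foo_get_specific_stats(BlockDriverState *bs)
    {
        BlockStatsSpecific *stats = g_new(BlockStatsSpecific, 1);
        BDRVFooState *s = bs->opaque;

        stats->driver = BLOCKDEV_DRIVER_FOO;        /* selects the QAPI union arm */
        stats->u.foo = (BlockStatsSpecificFoo) {
            .some_counter = s->stats.some_counter,  /* plain copy of the counter */
        };
        return stats;
    }

    static BlockDriver bdrv_foo = {
        /* ... */
        .bdrv_get_specific_stats = foo_get_specific_stats,
    };
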
21
block/qcow2.c | 85 ++++++++++++++++++++++++++++++++++++++
22
block/trace-events | 1 +
23
tests/qemu-iotests/060 | 7 +++-
24
tests/qemu-iotests/060.out | 5 ++-
25
7 files changed, 106 insertions(+), 4 deletions(-)
26
54
27
diff --git a/qapi/block-core.json b/qapi/block-core.json
55
diff --git a/qapi/block-core.json b/qapi/block-core.json
28
index XXXXXXX..XXXXXXX 100644
56
index XXXXXXX..XXXXXXX 100644
29
--- a/qapi/block-core.json
57
--- a/qapi/block-core.json
30
+++ b/qapi/block-core.json
58
+++ b/qapi/block-core.json
31
@@ -XXX,XX +XXX,XX @@
59
@@ -XXX,XX +XXX,XX @@
60
'discard-nb-failed': 'uint64',
61
'discard-bytes-ok': 'uint64' } }
62
63
+##
64
+# @BlockStatsSpecificNvme:
65
+#
66
+# NVMe driver statistics
67
+#
68
+# @completion-errors: The number of completion errors.
69
+#
70
+# @aligned-accesses: The number of aligned accesses performed by
71
+# the driver.
72
+#
73
+# @unaligned-accesses: The number of unaligned accesses performed by
74
+# the driver.
75
+#
76
+# Since: 5.2
77
+##
78
+{ 'struct': 'BlockStatsSpecificNvme',
79
+ 'data': {
80
+ 'completion-errors': 'uint64',
81
+ 'aligned-accesses': 'uint64',
82
+ 'unaligned-accesses': 'uint64' } }
83
+
84
##
85
# @BlockStatsSpecific:
32
#
86
#
33
# @cor_write: a write due to copy-on-read (since 2.11)
87
@@ -XXX,XX +XXX,XX @@
34
#
88
'discriminator': 'driver',
35
+# @cluster_alloc_space: an allocation of file space for a cluster (since 4.1)
89
'data': {
36
+#
90
'file': 'BlockStatsSpecificFile',
37
# Since: 2.9
91
- 'host_device': 'BlockStatsSpecificFile' } }
92
+ 'host_device': 'BlockStatsSpecificFile',
93
+ 'nvme': 'BlockStatsSpecificNvme' } }
94
38
##
95
##
39
{ 'enum': 'BlkdebugEvent', 'prefix': 'BLKDBG',
96
# @BlockStats:
40
@@ -XXX,XX +XXX,XX @@
97
diff --git a/block/nvme.c b/block/nvme.c
41
'pwritev_rmw_tail', 'pwritev_rmw_after_tail', 'pwritev',
42
'pwritev_zero', 'pwritev_done', 'empty_image_prepare',
43
'l1_shrink_write_table', 'l1_shrink_free_l2_clusters',
44
- 'cor_write'] }
45
+ 'cor_write', 'cluster_alloc_space'] }
46
47
##
48
# @BlkdebugInjectErrorOptions:
49
diff --git a/block/qcow2.h b/block/qcow2.h
50
index XXXXXXX..XXXXXXX 100644
98
index XXXXXXX..XXXXXXX 100644
51
--- a/block/qcow2.h
99
--- a/block/nvme.c
52
+++ b/block/qcow2.h
100
+++ b/block/nvme.c
53
@@ -XXX,XX +XXX,XX @@ typedef struct QCowL2Meta
101
@@ -XXX,XX +XXX,XX @@ struct BDRVNVMeState {
54
*/
102
55
Qcow2COWRegion cow_end;
103
/* PCI address (required for nvme_refresh_filename()) */
56
104
char *device;
57
+ /*
58
+ * Indicates that COW regions are already handled and do not require
59
+ * any more processing.
60
+ */
61
+ bool skip_cow;
62
+
105
+
63
/**
106
+ struct {
64
* The I/O vector with the data from the actual guest write request.
107
+ uint64_t completion_errors;
65
* If non-NULL, this is meant to be merged together with the data
108
+ uint64_t aligned_accesses;
66
diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
109
+ uint64_t unaligned_accesses;
67
index XXXXXXX..XXXXXXX 100644
110
+ } stats;
68
--- a/block/qcow2-cluster.c
111
};
69
+++ b/block/qcow2-cluster.c
112
70
@@ -XXX,XX +XXX,XX @@ static int perform_cow(BlockDriverState *bs, QCowL2Meta *m)
113
#define NVME_BLOCK_OPT_DEVICE "device"
71
assert(start->offset + start->nb_bytes <= end->offset);
114
@@ -XXX,XX +XXX,XX @@ static bool nvme_process_completion(NVMeQueuePair *q)
72
assert(!m->data_qiov || m->data_qiov->size == data_bytes);
115
break;
73
116
}
74
- if (start->nb_bytes == 0 && end->nb_bytes == 0) {
117
ret = nvme_translate_error(c);
75
+ if ((start->nb_bytes == 0 && end->nb_bytes == 0) || m->skip_cow) {
118
+ if (ret) {
76
return 0;
119
+ s->stats.completion_errors++;
120
+ }
121
q->cq.head = (q->cq.head + 1) % NVME_QUEUE_SIZE;
122
if (!q->cq.head) {
123
q->cq_phase = !q->cq_phase;
124
@@ -XXX,XX +XXX,XX @@ static int nvme_co_prw(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
125
assert(QEMU_IS_ALIGNED(bytes, s->page_size));
126
assert(bytes <= s->max_transfer);
127
if (nvme_qiov_aligned(bs, qiov)) {
128
+ s->stats.aligned_accesses++;
129
return nvme_co_prw_aligned(bs, offset, bytes, qiov, is_write, flags);
77
}
130
}
78
131
+ s->stats.unaligned_accesses++;
79
diff --git a/block/qcow2.c b/block/qcow2.c
132
trace_nvme_prw_buffered(s, offset, bytes, qiov->niov, is_write);
80
index XXXXXXX..XXXXXXX 100644
133
buf = qemu_try_memalign(s->page_size, bytes);
81
--- a/block/qcow2.c
134
82
+++ b/block/qcow2.c
135
@@ -XXX,XX +XXX,XX @@ static void nvme_unregister_buf(BlockDriverState *bs, void *host)
83
@@ -XXX,XX +XXX,XX @@ static bool merge_cow(uint64_t offset, unsigned bytes,
136
qemu_vfio_dma_unmap(s->vfio, host);
84
continue;
137
}
85
}
138
86
139
+static BlockStatsSpecific *nvme_get_specific_stats(BlockDriverState *bs)
87
+ /* If COW regions are handled already, skip this too */
140
+{
88
+ if (m->skip_cow) {
141
+ BlockStatsSpecific *stats = g_new(BlockStatsSpecific, 1);
89
+ continue;
142
+ BDRVNVMeState *s = bs->opaque;
90
+ }
91
+
143
+
92
/* The data (middle) region must be immediately after the
144
+ stats->driver = BLOCKDEV_DRIVER_NVME;
93
* start region */
145
+ stats->u.nvme = (BlockStatsSpecificNvme) {
94
if (l2meta_cow_start(m) + m->cow_start.nb_bytes != offset) {
146
+ .completion_errors = s->stats.completion_errors,
95
@@ -XXX,XX +XXX,XX @@ static bool merge_cow(uint64_t offset, unsigned bytes,
147
+ .aligned_accesses = s->stats.aligned_accesses,
96
return false;
148
+ .unaligned_accesses = s->stats.unaligned_accesses,
97
}
149
+ };
98
150
+
99
+static bool is_unallocated(BlockDriverState *bs, int64_t offset, int64_t bytes)
151
+ return stats;
100
+{
101
+ int64_t nr;
102
+ return !bytes ||
103
+ (!bdrv_is_allocated_above(bs, NULL, offset, bytes, &nr) && nr == bytes);
104
+}
152
+}
105
+
153
+
106
+static bool is_zero_cow(BlockDriverState *bs, QCowL2Meta *m)
154
static const char *const nvme_strong_runtime_opts[] = {
107
+{
155
NVME_BLOCK_OPT_DEVICE,
108
+ /*
156
NVME_BLOCK_OPT_NAMESPACE,
109
+ * This check is designed for optimization shortcut so it must be
157
@@ -XXX,XX +XXX,XX @@ static BlockDriver bdrv_nvme = {
110
+ * efficient.
158
.bdrv_refresh_filename = nvme_refresh_filename,
111
+ * Instead of is_zero(), use is_unallocated() as it is faster (but not
159
.bdrv_refresh_limits = nvme_refresh_limits,
112
+ * as accurate and can result in false negatives).
160
.strong_runtime_opts = nvme_strong_runtime_opts,
113
+ */
161
+ .bdrv_get_specific_stats = nvme_get_specific_stats,
114
+ return is_unallocated(bs, m->offset + m->cow_start.offset,
162
115
+ m->cow_start.nb_bytes) &&
163
.bdrv_detach_aio_context = nvme_detach_aio_context,
116
+ is_unallocated(bs, m->offset + m->cow_end.offset,
164
.bdrv_attach_aio_context = nvme_attach_aio_context,
117
+ m->cow_end.nb_bytes);
118
+}
119
+
120
+static int handle_alloc_space(BlockDriverState *bs, QCowL2Meta *l2meta)
121
+{
122
+ BDRVQcow2State *s = bs->opaque;
123
+ QCowL2Meta *m;
124
+
125
+ if (!(s->data_file->bs->supported_zero_flags & BDRV_REQ_NO_FALLBACK)) {
126
+ return 0;
127
+ }
128
+
129
+ if (bs->encrypted) {
130
+ return 0;
131
+ }
132
+
133
+ for (m = l2meta; m != NULL; m = m->next) {
134
+ int ret;
135
+
136
+ if (!m->cow_start.nb_bytes && !m->cow_end.nb_bytes) {
137
+ continue;
138
+ }
139
+
140
+ if (!is_zero_cow(bs, m)) {
141
+ continue;
142
+ }
143
+
144
+ /*
145
+ * instead of writing zero COW buffers,
146
+ * efficiently zero out the whole clusters
147
+ */
148
+
149
+ ret = qcow2_pre_write_overlap_check(bs, 0, m->alloc_offset,
150
+ m->nb_clusters * s->cluster_size,
151
+ true);
152
+ if (ret < 0) {
153
+ return ret;
154
+ }
155
+
156
+ BLKDBG_EVENT(bs->file, BLKDBG_CLUSTER_ALLOC_SPACE);
157
+ ret = bdrv_co_pwrite_zeroes(s->data_file, m->alloc_offset,
158
+ m->nb_clusters * s->cluster_size,
159
+ BDRV_REQ_NO_FALLBACK);
160
+ if (ret < 0) {
161
+ if (ret != -ENOTSUP && ret != -EAGAIN) {
162
+ return ret;
163
+ }
164
+ continue;
165
+ }
166
+
167
+ trace_qcow2_skip_cow(qemu_coroutine_self(), m->offset, m->nb_clusters);
168
+ m->skip_cow = true;
169
+ }
170
+ return 0;
171
+}
172
+
173
static coroutine_fn int qcow2_co_pwritev(BlockDriverState *bs, uint64_t offset,
174
uint64_t bytes, QEMUIOVector *qiov,
175
int flags)
176
@@ -XXX,XX +XXX,XX @@ static coroutine_fn int qcow2_co_pwritev(BlockDriverState *bs, uint64_t offset,
177
qemu_iovec_add(&hd_qiov, cluster_data, cur_bytes);
178
}
179
180
+ /* Try to efficiently initialize the physical space with zeroes */
181
+ ret = handle_alloc_space(bs, l2meta);
182
+ if (ret < 0) {
183
+ goto out_unlocked;
184
+ }
185
+
186
/* If we need to do COW, check if it's possible to merge the
187
* writing of the guest data together with that of the COW regions.
188
* If it's not possible (or not necessary) then write the
189
diff --git a/block/trace-events b/block/trace-events
190
index XXXXXXX..XXXXXXX 100644
191
--- a/block/trace-events
192
+++ b/block/trace-events
193
@@ -XXX,XX +XXX,XX @@ qcow2_writev_done_part(void *co, int cur_bytes) "co %p cur_bytes %d"
194
qcow2_writev_data(void *co, uint64_t offset) "co %p offset 0x%" PRIx64
195
qcow2_pwrite_zeroes_start_req(void *co, int64_t offset, int count) "co %p offset 0x%" PRIx64 " count %d"
196
qcow2_pwrite_zeroes(void *co, int64_t offset, int count) "co %p offset 0x%" PRIx64 " count %d"
197
+qcow2_skip_cow(void *co, uint64_t offset, int nb_clusters) "co %p offset 0x%" PRIx64 " nb_clusters %d"
198
199
# qcow2-cluster.c
200
qcow2_alloc_clusters_offset(void *co, uint64_t offset, int bytes) "co %p offset 0x%" PRIx64 " bytes %d"
201
diff --git a/tests/qemu-iotests/060 b/tests/qemu-iotests/060
202
index XXXXXXX..XXXXXXX 100755
203
--- a/tests/qemu-iotests/060
204
+++ b/tests/qemu-iotests/060
205
@@ -XXX,XX +XXX,XX @@ $QEMU_IO -c "$OPEN_RO" -c "read -P 1 0 512" | _filter_qemu_io
206
echo
207
echo "=== Testing overlap while COW is in flight ==="
208
echo
209
+BACKING_IMG=$TEST_IMG.base
210
+TEST_IMG=$BACKING_IMG _make_test_img 1G
211
+
212
+$QEMU_IO -c 'write 0k 64k' "$BACKING_IMG" | _filter_qemu_io
213
+
214
# compat=0.10 is required in order to make the following discard actually
215
# unallocate the sector rather than make it a zero sector - we want COW, after
216
# all.
217
-IMGOPTS='compat=0.10' _make_test_img 1G
218
+IMGOPTS='compat=0.10' _make_test_img -b "$BACKING_IMG" 1G
219
# Write two clusters, the second one enforces creation of an L2 table after
220
# the first data cluster.
221
$QEMU_IO -c 'write 0k 64k' -c 'write 512M 64k' "$TEST_IMG" | _filter_qemu_io
222
diff --git a/tests/qemu-iotests/060.out b/tests/qemu-iotests/060.out
223
index XXXXXXX..XXXXXXX 100644
224
--- a/tests/qemu-iotests/060.out
225
+++ b/tests/qemu-iotests/060.out
226
@@ -XXX,XX +XXX,XX @@ read 512/512 bytes at offset 0
227
228
=== Testing overlap while COW is in flight ===
229
230
-Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1073741824
231
+Formatting 'TEST_DIR/t.IMGFMT.base', fmt=IMGFMT size=1073741824
232
+wrote 65536/65536 bytes at offset 0
233
+64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
234
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1073741824 backing_file=TEST_DIR/t.IMGFMT.base
235
wrote 65536/65536 bytes at offset 0
236
64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
237
wrote 65536/65536 bytes at offset 536870912
238
--
165
--
239
2.21.0
166
2.26.2
240
167
241
1
From: Alberto Garcia <berto@igalia.com>
1
From: Coiby Xu <coiby.xu@gmail.com>
2
2
3
A consequence of the previous patch is that bdrv_attach_child()
3
Allow vu_message_read to be replaced by one which will make use of the
4
transfers the reference to child_bs from the caller to parent_bs,
4
QIOChannel functions. Thus reading a vhost-user message won't stall the
5
which will drop it on bdrv_close() or when someone calls
5
guest. For the slave channel, we still use the default vu_message_read.
6
bdrv_unref_child().
7
6
8
But this only happens when bdrv_attach_child() succeeds. If it fails
7
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
9
then the caller is responsible for dropping the reference to child_bs.
8
Signed-off-by: Coiby Xu <coiby.xu@gmail.com>
9
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
10
Message-id: 20200918080912.321299-2-coiby.xu@gmail.com
11
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
12
---
13
contrib/libvhost-user/libvhost-user.h | 21 +++++++++++++++++++++
14
contrib/libvhost-user/libvhost-user-glib.c | 2 +-
15
contrib/libvhost-user/libvhost-user.c | 14 +++++++-------
16
tests/vhost-user-bridge.c | 2 ++
17
tools/virtiofsd/fuse_virtio.c | 4 ++--
18
5 files changed, 33 insertions(+), 10 deletions(-)
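
To illustrate the new hook (illustration only, not part of the patch; every "my_*" name below is invented): after this change a caller supplies its own reader through the extra vu_init() parameter, while passing NULL keeps the built-in vu_message_read_default().

    /* Sketch of a caller-supplied reader.  It must fill *vmsg and return
     * true on success; on failure it should close any received fds and
     * return false, as documented in the header comment below. */
    static bool my_read_msg(VuDev *dev, int sock, VhostUserMsg *vmsg)
    {
        /* e.g. read the VHOST_USER_HDR_SIZE header and then the payload
         * without blocking the event loop */
        return false; /* placeholder */
    }

    static void my_setup(VuDev *dev, int socket_fd, const VuDevIface *iface)
    {
        /* my_panic/my_set_watch/my_remove_watch are the caller's existing
         * callbacks (hypothetical here); passing NULL instead of my_read_msg
         * selects vu_message_read_default(). */
        if (!vu_init(dev, 1 /* max_queues */, socket_fd, my_panic,
                     my_read_msg, my_set_watch, my_remove_watch, iface)) {
            /* initialization failed */
        }
    }
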
10
19
11
This patch makes bdrv_attach_child() take the reference also when
20
diff --git a/contrib/libvhost-user/libvhost-user.h b/contrib/libvhost-user/libvhost-user.h
12
there is an error, freeing the caller from having to do it.
13
14
A similar situation happens with bdrv_root_attach_child(), so the
15
changes on this patch affect both functions.
16
17
Signed-off-by: Alberto Garcia <berto@igalia.com>
18
Message-id: 20dfb3d9ccec559cdd1a9690146abad5d204a186.1557754872.git.berto@igalia.com
19
[mreitz: Removed now superfluous BdrvChild * variable in
20
bdrv_open_child()]
21
Signed-off-by: Max Reitz <mreitz@redhat.com>
22
---
23
block.c | 30 ++++++++++++++++++------------
24
block/block-backend.c | 3 +--
25
block/quorum.c | 1 -
26
blockjob.c | 2 +-
27
4 files changed, 20 insertions(+), 16 deletions(-)
28
29
diff --git a/block.c b/block.c
30
index XXXXXXX..XXXXXXX 100644
21
index XXXXXXX..XXXXXXX 100644
31
--- a/block.c
22
--- a/contrib/libvhost-user/libvhost-user.h
32
+++ b/block.c
23
+++ b/contrib/libvhost-user/libvhost-user.h
33
@@ -XXX,XX +XXX,XX @@ static void bdrv_replace_child(BdrvChild *child, BlockDriverState *new_bs)
24
@@ -XXX,XX +XXX,XX @@
25
*/
26
#define VHOST_USER_MAX_RAM_SLOTS 32
27
28
+#define VHOST_USER_HDR_SIZE offsetof(VhostUserMsg, payload.u64)
29
+
30
typedef enum VhostSetConfigType {
31
VHOST_SET_CONFIG_TYPE_MASTER = 0,
32
VHOST_SET_CONFIG_TYPE_MIGRATION = 1,
33
@@ -XXX,XX +XXX,XX @@ typedef uint64_t (*vu_get_features_cb) (VuDev *dev);
34
typedef void (*vu_set_features_cb) (VuDev *dev, uint64_t features);
35
typedef int (*vu_process_msg_cb) (VuDev *dev, VhostUserMsg *vmsg,
36
int *do_reply);
37
+typedef bool (*vu_read_msg_cb) (VuDev *dev, int sock, VhostUserMsg *vmsg);
38
typedef void (*vu_queue_set_started_cb) (VuDev *dev, int qidx, bool started);
39
typedef bool (*vu_queue_is_processed_in_order_cb) (VuDev *dev, int qidx);
40
typedef int (*vu_get_config_cb) (VuDev *dev, uint8_t *config, uint32_t len);
41
@@ -XXX,XX +XXX,XX @@ struct VuDev {
42
bool broken;
43
uint16_t max_queues;
44
45
+ /* @read_msg: custom method to read vhost-user message
46
+ *
47
+ * Read data from vhost_user socket fd and fill up
48
+ * the passed VhostUserMsg *vmsg struct.
49
+ *
50
+ * If reading fails, it should close the received set of file
51
+ * descriptors as socket message's auxiliary data.
52
+ *
53
+ * For the details, please refer to vu_message_read in libvhost-user.c
54
+ * which will be used by default if not custom method is provided when
55
+ * calling vu_init
56
+ *
57
+ * Returns: true if vhost-user message successfully received,
58
+ * otherwise return false.
59
+ *
60
+ */
61
+ vu_read_msg_cb read_msg;
62
/* @set_watch: add or update the given fd to the watch set,
63
* call cb when condition is met */
64
vu_set_watch_cb set_watch;
65
@@ -XXX,XX +XXX,XX @@ bool vu_init(VuDev *dev,
66
uint16_t max_queues,
67
int socket,
68
vu_panic_cb panic,
69
+ vu_read_msg_cb read_msg,
70
vu_set_watch_cb set_watch,
71
vu_remove_watch_cb remove_watch,
72
const VuDevIface *iface);
73
diff --git a/contrib/libvhost-user/libvhost-user-glib.c b/contrib/libvhost-user/libvhost-user-glib.c
74
index XXXXXXX..XXXXXXX 100644
75
--- a/contrib/libvhost-user/libvhost-user-glib.c
76
+++ b/contrib/libvhost-user/libvhost-user-glib.c
77
@@ -XXX,XX +XXX,XX @@ vug_init(VugDev *dev, uint16_t max_queues, int socket,
78
g_assert(dev);
79
g_assert(iface);
80
81
- if (!vu_init(&dev->parent, max_queues, socket, panic, set_watch,
82
+ if (!vu_init(&dev->parent, max_queues, socket, panic, NULL, set_watch,
83
remove_watch, iface)) {
84
return false;
34
}
85
}
86
diff --git a/contrib/libvhost-user/libvhost-user.c b/contrib/libvhost-user/libvhost-user.c
87
index XXXXXXX..XXXXXXX 100644
88
--- a/contrib/libvhost-user/libvhost-user.c
89
+++ b/contrib/libvhost-user/libvhost-user.c
90
@@ -XXX,XX +XXX,XX @@
91
/* The version of inflight buffer */
92
#define INFLIGHT_VERSION 1
93
94
-#define VHOST_USER_HDR_SIZE offsetof(VhostUserMsg, payload.u64)
95
-
96
/* The version of the protocol we support */
97
#define VHOST_USER_VERSION 1
98
#define LIBVHOST_USER_DEBUG 0
99
@@ -XXX,XX +XXX,XX @@ have_userfault(void)
35
}
100
}
36
101
37
+/*
102
static bool
38
+ * This function steals the reference to child_bs from the caller.
103
-vu_message_read(VuDev *dev, int conn_fd, VhostUserMsg *vmsg)
39
+ * That reference is later dropped by bdrv_root_unref_child().
104
+vu_message_read_default(VuDev *dev, int conn_fd, VhostUserMsg *vmsg)
40
+ *
41
+ * On failure NULL is returned, errp is set and the reference to
42
+ * child_bs is also dropped.
43
+ */
44
BdrvChild *bdrv_root_attach_child(BlockDriverState *child_bs,
45
const char *child_name,
46
const BdrvChildRole *child_role,
47
@@ -XXX,XX +XXX,XX @@ BdrvChild *bdrv_root_attach_child(BlockDriverState *child_bs,
48
ret = bdrv_check_update_perm(child_bs, NULL, perm, shared_perm, NULL, errp);
49
if (ret < 0) {
50
bdrv_abort_perm_update(child_bs);
51
+ bdrv_unref(child_bs);
52
return NULL;
53
}
54
55
@@ -XXX,XX +XXX,XX @@ BdrvChild *bdrv_root_attach_child(BlockDriverState *child_bs,
56
return child;
57
}
58
59
+/*
60
+ * This function transfers the reference to child_bs from the caller
61
+ * to parent_bs. That reference is later dropped by parent_bs on
62
+ * bdrv_close() or if someone calls bdrv_unref_child().
63
+ *
64
+ * On failure NULL is returned, errp is set and the reference to
65
+ * child_bs is also dropped.
66
+ */
67
BdrvChild *bdrv_attach_child(BlockDriverState *parent_bs,
68
BlockDriverState *child_bs,
69
const char *child_name,
70
@@ -XXX,XX +XXX,XX @@ void bdrv_set_backing_hd(BlockDriverState *bs, BlockDriverState *backing_hd,
71
/* If backing_hd was already part of bs's backing chain, and
72
* inherits_from pointed recursively to bs then let's update it to
73
* point directly to bs (else it will become NULL). */
74
- if (update_inherits_from) {
75
+ if (bs->backing && update_inherits_from) {
76
backing_hd->inherits_from = bs;
77
}
78
- if (!bs->backing) {
79
- bdrv_unref(backing_hd);
80
- }
81
82
out:
83
bdrv_refresh_limits(bs, NULL);
84
@@ -XXX,XX +XXX,XX @@ BdrvChild *bdrv_open_child(const char *filename,
85
const BdrvChildRole *child_role,
86
bool allow_none, Error **errp)
87
{
105
{
88
- BdrvChild *c;
106
char control[CMSG_SPACE(VHOST_MEMORY_BASELINE_NREGIONS * sizeof(int))] = {};
89
BlockDriverState *bs;
107
struct iovec iov = {
90
108
@@ -XXX,XX +XXX,XX @@ vu_process_message_reply(VuDev *dev, const VhostUserMsg *vmsg)
91
bs = bdrv_open_child_bs(filename, options, bdref_key, parent, child_role,
92
@@ -XXX,XX +XXX,XX @@ BdrvChild *bdrv_open_child(const char *filename,
93
return NULL;
94
}
95
96
- c = bdrv_attach_child(parent, bs, bdref_key, child_role, errp);
97
- if (!c) {
98
- bdrv_unref(bs);
99
- return NULL;
100
- }
101
-
102
- return c;
103
+ return bdrv_attach_child(parent, bs, bdref_key, child_role, errp);
104
}
105
106
/* TODO Future callers may need to specify parent/child_role in order for
107
diff --git a/block/block-backend.c b/block/block-backend.c
108
index XXXXXXX..XXXXXXX 100644
109
--- a/block/block-backend.c
110
+++ b/block/block-backend.c
111
@@ -XXX,XX +XXX,XX @@ BlockBackend *blk_new_open(const char *filename, const char *reference,
112
blk->root = bdrv_root_attach_child(bs, "root", &child_root,
113
perm, BLK_PERM_ALL, blk, errp);
114
if (!blk->root) {
115
- bdrv_unref(bs);
116
blk_unref(blk);
117
return NULL;
118
}
119
@@ -XXX,XX +XXX,XX @@ void blk_remove_bs(BlockBackend *blk)
120
int blk_insert_bs(BlockBackend *blk, BlockDriverState *bs, Error **errp)
121
{
122
ThrottleGroupMember *tgm = &blk->public.throttle_group_member;
123
+ bdrv_ref(bs);
124
blk->root = bdrv_root_attach_child(bs, "root", &child_root,
125
blk->perm, blk->shared_perm, blk, errp);
126
if (blk->root == NULL) {
127
return -EPERM;
128
}
129
- bdrv_ref(bs);
130
131
notifier_list_notify(&blk->insert_bs_notifiers, blk);
132
if (tgm->throttle_state) {
133
diff --git a/block/quorum.c b/block/quorum.c
134
index XXXXXXX..XXXXXXX 100644
135
--- a/block/quorum.c
136
+++ b/block/quorum.c
137
@@ -XXX,XX +XXX,XX @@ static void quorum_add_child(BlockDriverState *bs, BlockDriverState *child_bs,
138
child = bdrv_attach_child(bs, child_bs, indexstr, &child_format, errp);
139
if (child == NULL) {
140
s->next_child_index--;
141
- bdrv_unref(child_bs);
142
goto out;
109
goto out;
143
}
110
}
144
s->children = g_renew(BdrvChild *, s->children, s->num_children + 1);
111
145
diff --git a/blockjob.c b/blockjob.c
112
- if (!vu_message_read(dev, dev->slave_fd, &msg_reply)) {
113
+ if (!vu_message_read_default(dev, dev->slave_fd, &msg_reply)) {
114
goto out;
115
}
116
117
@@ -XXX,XX +XXX,XX @@ vu_set_mem_table_exec_postcopy(VuDev *dev, VhostUserMsg *vmsg)
118
/* Wait for QEMU to confirm that it's registered the handler for the
119
* faults.
120
*/
121
- if (!vu_message_read(dev, dev->sock, vmsg) ||
122
+ if (!dev->read_msg(dev, dev->sock, vmsg) ||
123
vmsg->size != sizeof(vmsg->payload.u64) ||
124
vmsg->payload.u64 != 0) {
125
vu_panic(dev, "failed to receive valid ack for postcopy set-mem-table");
126
@@ -XXX,XX +XXX,XX @@ vu_dispatch(VuDev *dev)
127
int reply_requested;
128
bool need_reply, success = false;
129
130
- if (!vu_message_read(dev, dev->sock, &vmsg)) {
131
+ if (!dev->read_msg(dev, dev->sock, &vmsg)) {
132
goto end;
133
}
134
135
@@ -XXX,XX +XXX,XX @@ vu_init(VuDev *dev,
136
uint16_t max_queues,
137
int socket,
138
vu_panic_cb panic,
139
+ vu_read_msg_cb read_msg,
140
vu_set_watch_cb set_watch,
141
vu_remove_watch_cb remove_watch,
142
const VuDevIface *iface)
143
@@ -XXX,XX +XXX,XX @@ vu_init(VuDev *dev,
144
145
dev->sock = socket;
146
dev->panic = panic;
147
+ dev->read_msg = read_msg ? read_msg : vu_message_read_default;
148
dev->set_watch = set_watch;
149
dev->remove_watch = remove_watch;
150
dev->iface = iface;
151
@@ -XXX,XX +XXX,XX @@ static void _vu_queue_notify(VuDev *dev, VuVirtq *vq, bool sync)
152
153
vu_message_write(dev, dev->slave_fd, &vmsg);
154
if (ack) {
155
- vu_message_read(dev, dev->slave_fd, &vmsg);
156
+ vu_message_read_default(dev, dev->slave_fd, &vmsg);
157
}
158
return;
159
}
160
diff --git a/tests/vhost-user-bridge.c b/tests/vhost-user-bridge.c
146
index XXXXXXX..XXXXXXX 100644
161
index XXXXXXX..XXXXXXX 100644
147
--- a/blockjob.c
162
--- a/tests/vhost-user-bridge.c
148
+++ b/blockjob.c
163
+++ b/tests/vhost-user-bridge.c
149
@@ -XXX,XX +XXX,XX @@ int block_job_add_bdrv(BlockJob *job, const char *name, BlockDriverState *bs,
164
@@ -XXX,XX +XXX,XX @@ vubr_accept_cb(int sock, void *ctx)
150
{
165
VHOST_USER_BRIDGE_MAX_QUEUES,
151
BdrvChild *c;
166
conn_fd,
152
167
vubr_panic,
153
+ bdrv_ref(bs);
168
+ NULL,
154
c = bdrv_root_attach_child(bs, name, &child_job, perm, shared_perm,
169
vubr_set_watch,
155
job, errp);
170
vubr_remove_watch,
156
if (c == NULL) {
171
&vuiface)) {
157
@@ -XXX,XX +XXX,XX @@ int block_job_add_bdrv(BlockJob *job, const char *name, BlockDriverState *bs,
172
@@ -XXX,XX +XXX,XX @@ vubr_new(const char *path, bool client)
158
}
173
VHOST_USER_BRIDGE_MAX_QUEUES,
159
174
dev->sock,
160
job->nodes = g_slist_prepend(job->nodes, c);
175
vubr_panic,
161
- bdrv_ref(bs);
176
+ NULL,
162
bdrv_op_block_all(bs, job->blocker);
177
vubr_set_watch,
178
vubr_remove_watch,
179
&vuiface)) {
180
diff --git a/tools/virtiofsd/fuse_virtio.c b/tools/virtiofsd/fuse_virtio.c
181
index XXXXXXX..XXXXXXX 100644
182
--- a/tools/virtiofsd/fuse_virtio.c
183
+++ b/tools/virtiofsd/fuse_virtio.c
184
@@ -XXX,XX +XXX,XX @@ int virtio_session_mount(struct fuse_session *se)
185
se->vu_socketfd = data_sock;
186
se->virtio_dev->se = se;
187
pthread_rwlock_init(&se->virtio_dev->vu_dispatch_rwlock, NULL);
188
- vu_init(&se->virtio_dev->dev, 2, se->vu_socketfd, fv_panic, fv_set_watch,
189
- fv_remove_watch, &fv_iface);
190
+ vu_init(&se->virtio_dev->dev, 2, se->vu_socketfd, fv_panic, NULL,
191
+ fv_set_watch, fv_remove_watch, &fv_iface);
163
192
164
return 0;
193
return 0;
194
}
165
--
195
--
166
2.21.0
196
2.26.2
167
197
168
From: Coiby Xu <coiby.xu@gmail.com>

When the client is running in gdb and the quit command is issued in gdb,
QEMU will still dispatch the event, which causes a segmentation fault in
the callback function.

Signed-off-by: Coiby Xu <coiby.xu@gmail.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20200918080912.321299-3-coiby.xu@gmail.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 contrib/libvhost-user/libvhost-user.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/contrib/libvhost-user/libvhost-user.c b/contrib/libvhost-user/libvhost-user.c
index XXXXXXX..XXXXXXX 100644
--- a/contrib/libvhost-user/libvhost-user.c
+++ b/contrib/libvhost-user/libvhost-user.c
@@ -XXX,XX +XXX,XX @@ vu_deinit(VuDev *dev)
         }

         if (vq->kick_fd != -1) {
+            dev->remove_watch(dev, vq->kick_fd);
             close(vq->kick_fd);
             vq->kick_fd = -1;
         }
--
2.26.2

1
From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
1
From: Coiby Xu <coiby.xu@gmail.com>
2
2
3
Move compression-on-threads to separate file. Encryption will be in it
3
Sharing QEMU devices via vhost-user protocol.
4
too.
5
4
6
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
5
Only one vhost-user client can connect to the server at a time.
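
As a usage illustration only (hypothetical caller code, not part of this patch), a device backend hands its listening socket address, AioContext and VuDevIface to the API declared in util/vhost-user-server.h below:

    /* "my_iface" and "addr" are assumed to exist; the signature matches the
     * vhost_user_server_start() declaration added by this patch. */
    static bool my_backend_start(VuServer *server, SocketAddress *addr,
                                 const VuDevIface *my_iface, Error **errp)
    {
        /* Serve in the main loop's AioContext; one client at a time. */
        return vhost_user_server_start(server, addr, qemu_get_aio_context(),
                                       1 /* max_queues */,
                                       NULL /* device_panic_notifier */,
                                       my_iface, errp);
    }

    /* ... and later, to tear it down: vhost_user_server_stop(server); */
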
7
Reviewed-by: Alberto Garcia <berto@igalia.com>
6
8
Reviewed-by: Max Reitz <mreitz@redhat.com>
7
Suggested-by: Kevin Wolf <kwolf@redhat.com>
9
Message-id: 20190506142741.41731-3-vsementsov@virtuozzo.com
8
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10
Signed-off-by: Max Reitz <mreitz@redhat.com>
9
Signed-off-by: Coiby Xu <coiby.xu@gmail.com>
10
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
11
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
12
Message-id: 20200918080912.321299-4-coiby.xu@gmail.com
13
[Fixed size_t %lu -> %zu format string compiler error.
14
--Stefan]
15
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
11
---
16
---
12
block/Makefile.objs | 2 +-
17
util/vhost-user-server.h | 65 ++++++
13
block/qcow2.h | 7 ++
18
util/vhost-user-server.c | 428 +++++++++++++++++++++++++++++++++++++++
14
block/qcow2-threads.c | 201 ++++++++++++++++++++++++++++++++++++++++++
19
util/meson.build | 1 +
15
block/qcow2.c | 169 -----------------------------------
20
3 files changed, 494 insertions(+)
16
4 files changed, 209 insertions(+), 170 deletions(-)
21
create mode 100644 util/vhost-user-server.h
17
create mode 100644 block/qcow2-threads.c
22
create mode 100644 util/vhost-user-server.c
18
23
19
diff --git a/block/Makefile.objs b/block/Makefile.objs
24
diff --git a/util/vhost-user-server.h b/util/vhost-user-server.h
20
index XXXXXXX..XXXXXXX 100644
21
--- a/block/Makefile.objs
22
+++ b/block/Makefile.objs
23
@@ -XXX,XX +XXX,XX @@ block-obj-$(CONFIG_BOCHS) += bochs.o
24
block-obj-$(CONFIG_VVFAT) += vvfat.o
25
block-obj-$(CONFIG_DMG) += dmg.o
26
27
-block-obj-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o qcow2-cache.o qcow2-bitmap.o
28
+block-obj-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o qcow2-cache.o qcow2-bitmap.o qcow2-threads.o
29
block-obj-$(CONFIG_QED) += qed.o qed-l2-cache.o qed-table.o qed-cluster.o
30
block-obj-$(CONFIG_QED) += qed-check.o
31
block-obj-y += vhdx.o vhdx-endian.o vhdx-log.o
32
diff --git a/block/qcow2.h b/block/qcow2.h
33
index XXXXXXX..XXXXXXX 100644
34
--- a/block/qcow2.h
35
+++ b/block/qcow2.h
36
@@ -XXX,XX +XXX,XX @@ void qcow2_remove_persistent_dirty_bitmap(BlockDriverState *bs,
37
const char *name,
38
Error **errp);
39
40
+ssize_t coroutine_fn
41
+qcow2_co_compress(BlockDriverState *bs, void *dest, size_t dest_size,
42
+ const void *src, size_t src_size);
43
+ssize_t coroutine_fn
44
+qcow2_co_decompress(BlockDriverState *bs, void *dest, size_t dest_size,
45
+ const void *src, size_t src_size);
46
+
47
#endif
48
diff --git a/block/qcow2-threads.c b/block/qcow2-threads.c
49
new file mode 100644
25
new file mode 100644
50
index XXXXXXX..XXXXXXX
26
index XXXXXXX..XXXXXXX
51
--- /dev/null
27
--- /dev/null
52
+++ b/block/qcow2-threads.c
28
+++ b/util/vhost-user-server.h
53
@@ -XXX,XX +XXX,XX @@
29
@@ -XXX,XX +XXX,XX @@
54
+/*
30
+/*
55
+ * Threaded data processing for Qcow2: compression, encryption
31
+ * Sharing QEMU devices via vhost-user protocol
56
+ *
32
+ *
57
+ * Copyright (c) 2004-2006 Fabrice Bellard
33
+ * Copyright (c) Coiby Xu <coiby.xu@gmail.com>.
58
+ * Copyright (c) 2018 Virtuozzo International GmbH. All rights reserved.
34
+ * Copyright (c) 2020 Red Hat, Inc.
59
+ *
35
+ *
60
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
36
+ * This work is licensed under the terms of the GNU GPL, version 2 or
61
+ * of this software and associated documentation files (the "Software"), to deal
37
+ * later. See the COPYING file in the top-level directory.
62
+ * in the Software without restriction, including without limitation the rights
38
+ */
63
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
39
+
64
+ * copies of the Software, and to permit persons to whom the Software is
40
+#ifndef VHOST_USER_SERVER_H
65
+ * furnished to do so, subject to the following conditions:
41
+#define VHOST_USER_SERVER_H
42
+
43
+#include "contrib/libvhost-user/libvhost-user.h"
44
+#include "io/channel-socket.h"
45
+#include "io/channel-file.h"
46
+#include "io/net-listener.h"
47
+#include "qemu/error-report.h"
48
+#include "qapi/error.h"
49
+#include "standard-headers/linux/virtio_blk.h"
50
+
51
+typedef struct VuFdWatch {
52
+ VuDev *vu_dev;
53
+ int fd; /*kick fd*/
54
+ void *pvt;
55
+ vu_watch_cb cb;
56
+ bool processing;
57
+ QTAILQ_ENTRY(VuFdWatch) next;
58
+} VuFdWatch;
59
+
60
+typedef struct VuServer VuServer;
61
+typedef void DevicePanicNotifierFn(VuServer *server);
62
+
63
+struct VuServer {
64
+ QIONetListener *listener;
65
+ AioContext *ctx;
66
+ DevicePanicNotifierFn *device_panic_notifier;
67
+ int max_queues;
68
+ const VuDevIface *vu_iface;
69
+ VuDev vu_dev;
70
+ QIOChannel *ioc; /* The I/O channel with the client */
71
+ QIOChannelSocket *sioc; /* The underlying data channel with the client */
72
+ /* IOChannel for fd provided via VHOST_USER_SET_SLAVE_REQ_FD */
73
+ QIOChannel *ioc_slave;
74
+ QIOChannelSocket *sioc_slave;
75
+ Coroutine *co_trip; /* coroutine for processing VhostUserMsg */
76
+ QTAILQ_HEAD(, VuFdWatch) vu_fd_watches;
77
+ /* restart coroutine co_trip if AIOContext is changed */
78
+ bool aio_context_changed;
79
+ bool processing_msg;
80
+};
81
+
82
+bool vhost_user_server_start(VuServer *server,
83
+ SocketAddress *unix_socket,
84
+ AioContext *ctx,
85
+ uint16_t max_queues,
86
+ DevicePanicNotifierFn *device_panic_notifier,
87
+ const VuDevIface *vu_iface,
88
+ Error **errp);
89
+
90
+void vhost_user_server_stop(VuServer *server);
91
+
92
+void vhost_user_server_set_aio_context(VuServer *server, AioContext *ctx);
93
+
94
+#endif /* VHOST_USER_SERVER_H */
95
diff --git a/util/vhost-user-server.c b/util/vhost-user-server.c
96
new file mode 100644
97
index XXXXXXX..XXXXXXX
98
--- /dev/null
99
+++ b/util/vhost-user-server.c
100
@@ -XXX,XX +XXX,XX @@
101
+/*
102
+ * Sharing QEMU devices via vhost-user protocol
66
+ *
103
+ *
67
+ * The above copyright notice and this permission notice shall be included in
104
+ * Copyright (c) Coiby Xu <coiby.xu@gmail.com>.
68
+ * all copies or substantial portions of the Software.
105
+ * Copyright (c) 2020 Red Hat, Inc.
69
+ *
106
+ *
70
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
107
+ * This work is licensed under the terms of the GNU GPL, version 2 or
71
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
108
+ * later. See the COPYING file in the top-level directory.
72
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
73
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
74
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
75
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
76
+ * THE SOFTWARE.
77
+ */
109
+ */
78
+
79
+#include "qemu/osdep.h"
110
+#include "qemu/osdep.h"
80
+
111
+#include "qemu/main-loop.h"
81
+#define ZLIB_CONST
112
+#include "vhost-user-server.h"
82
+#include <zlib.h>
113
+
83
+
114
+static void vmsg_close_fds(VhostUserMsg *vmsg)
84
+#include "qcow2.h"
115
+{
85
+#include "block/thread-pool.h"
116
+ int i;
86
+
117
+ for (i = 0; i < vmsg->fd_num; i++) {
87
+#define MAX_COMPRESS_THREADS 4
118
+ close(vmsg->fds[i]);
88
+
119
+ }
89
+typedef ssize_t (*Qcow2CompressFunc)(void *dest, size_t dest_size,
120
+}
90
+ const void *src, size_t src_size);
121
+
91
+typedef struct Qcow2CompressData {
122
+static void vmsg_unblock_fds(VhostUserMsg *vmsg)
92
+ void *dest;
123
+{
93
+ size_t dest_size;
124
+ int i;
94
+ const void *src;
125
+ for (i = 0; i < vmsg->fd_num; i++) {
95
+ size_t src_size;
126
+ qemu_set_nonblock(vmsg->fds[i]);
96
+ ssize_t ret;
127
+ }
97
+
128
+}
98
+ Qcow2CompressFunc func;
129
+
99
+} Qcow2CompressData;
130
+static void vu_accept(QIONetListener *listener, QIOChannelSocket *sioc,
131
+ gpointer opaque);
132
+
133
+static void close_client(VuServer *server)
134
+{
135
+ /*
136
+ * Before closing the client
137
+ *
138
+ * 1. Let vu_client_trip stop processing new vhost-user msg
139
+ *
140
+ * 2. remove kick_handler
141
+ *
142
+ * 3. wait for the kick handler to be finished
143
+ *
144
+ * 4. wait for the current vhost-user msg to be finished processing
145
+ */
146
+
147
+ QIOChannelSocket *sioc = server->sioc;
148
+ /* When this is set vu_client_trip will stop new processing vhost-user message */
149
+ server->sioc = NULL;
150
+
151
+ VuFdWatch *vu_fd_watch, *next;
152
+ QTAILQ_FOREACH_SAFE(vu_fd_watch, &server->vu_fd_watches, next, next) {
153
+ aio_set_fd_handler(server->ioc->ctx, vu_fd_watch->fd, true, NULL,
154
+ NULL, NULL, NULL);
155
+ }
156
+
157
+ while (!QTAILQ_EMPTY(&server->vu_fd_watches)) {
158
+ QTAILQ_FOREACH_SAFE(vu_fd_watch, &server->vu_fd_watches, next, next) {
159
+ if (!vu_fd_watch->processing) {
160
+ QTAILQ_REMOVE(&server->vu_fd_watches, vu_fd_watch, next);
161
+ g_free(vu_fd_watch);
162
+ }
163
+ }
164
+ }
165
+
166
+ while (server->processing_msg) {
167
+ if (server->ioc->read_coroutine) {
168
+ server->ioc->read_coroutine = NULL;
169
+ qio_channel_set_aio_fd_handler(server->ioc, server->ioc->ctx, NULL,
170
+ NULL, server->ioc);
171
+ server->processing_msg = false;
172
+ }
173
+ }
174
+
175
+ vu_deinit(&server->vu_dev);
176
+ object_unref(OBJECT(sioc));
177
+ object_unref(OBJECT(server->ioc));
178
+}
179
+
180
+static void panic_cb(VuDev *vu_dev, const char *buf)
181
+{
182
+ VuServer *server = container_of(vu_dev, VuServer, vu_dev);
183
+
184
+ /* avoid while loop in close_client */
185
+ server->processing_msg = false;
186
+
187
+ if (buf) {
188
+ error_report("vu_panic: %s", buf);
189
+ }
190
+
191
+ if (server->sioc) {
192
+ close_client(server);
193
+ }
194
+
195
+ if (server->device_panic_notifier) {
196
+ server->device_panic_notifier(server);
197
+ }
198
+
199
+ /*
200
+ * Set the callback function for network listener so another
201
+ * vhost-user client can connect to this server
202
+ */
203
+ qio_net_listener_set_client_func(server->listener,
204
+ vu_accept,
205
+ server,
206
+ NULL);
207
+}
208
+
209
+static bool coroutine_fn
210
+vu_message_read(VuDev *vu_dev, int conn_fd, VhostUserMsg *vmsg)
211
+{
212
+ struct iovec iov = {
213
+ .iov_base = (char *)vmsg,
214
+ .iov_len = VHOST_USER_HDR_SIZE,
215
+ };
216
+ int rc, read_bytes = 0;
217
+ Error *local_err = NULL;
218
+ /*
219
+ * Store fds/nfds returned from qio_channel_readv_full into
220
+ * temporary variables.
221
+ *
222
+ * VhostUserMsg is a packed structure, gcc will complain about passing
223
+ * pointer to a packed structure member if we pass &VhostUserMsg.fd_num
224
+ * and &VhostUserMsg.fds directly when calling qio_channel_readv_full,
225
+ * thus two temporary variables nfds and fds are used here.
226
+ */
227
+ size_t nfds = 0, nfds_t = 0;
228
+ const size_t max_fds = G_N_ELEMENTS(vmsg->fds);
229
+ int *fds_t = NULL;
230
+ VuServer *server = container_of(vu_dev, VuServer, vu_dev);
231
+ QIOChannel *ioc = server->ioc;
232
+
233
+ if (!ioc) {
234
+ error_report_err(local_err);
235
+ goto fail;
236
+ }
237
+
238
+ assert(qemu_in_coroutine());
239
+ do {
240
+ /*
241
+ * qio_channel_readv_full may have short reads, keeping calling it
242
+ * until getting VHOST_USER_HDR_SIZE or 0 bytes in total
243
+ */
244
+ rc = qio_channel_readv_full(ioc, &iov, 1, &fds_t, &nfds_t, &local_err);
245
+ if (rc < 0) {
246
+ if (rc == QIO_CHANNEL_ERR_BLOCK) {
247
+ qio_channel_yield(ioc, G_IO_IN);
248
+ continue;
249
+ } else {
250
+ error_report_err(local_err);
251
+ return false;
252
+ }
253
+ }
254
+ read_bytes += rc;
255
+ if (nfds_t > 0) {
256
+ if (nfds + nfds_t > max_fds) {
257
+ error_report("A maximum of %zu fds are allowed, "
258
+ "however got %zu fds now",
259
+ max_fds, nfds + nfds_t);
260
+ goto fail;
261
+ }
262
+ memcpy(vmsg->fds + nfds, fds_t,
263
+ nfds_t *sizeof(vmsg->fds[0]));
264
+ nfds += nfds_t;
265
+ g_free(fds_t);
266
+ }
267
+ if (read_bytes == VHOST_USER_HDR_SIZE || rc == 0) {
268
+ break;
269
+ }
270
+ iov.iov_base = (char *)vmsg + read_bytes;
271
+ iov.iov_len = VHOST_USER_HDR_SIZE - read_bytes;
272
+ } while (true);
273
+
274
+ vmsg->fd_num = nfds;
275
+ /* qio_channel_readv_full will make socket fds blocking, unblock them */
276
+ vmsg_unblock_fds(vmsg);
277
+ if (vmsg->size > sizeof(vmsg->payload)) {
278
+ error_report("Error: too big message request: %d, "
279
+ "size: vmsg->size: %u, "
280
+ "while sizeof(vmsg->payload) = %zu",
281
+ vmsg->request, vmsg->size, sizeof(vmsg->payload));
282
+ goto fail;
283
+ }
284
+
285
+ struct iovec iov_payload = {
286
+ .iov_base = (char *)&vmsg->payload,
287
+ .iov_len = vmsg->size,
288
+ };
289
+ if (vmsg->size) {
290
+ rc = qio_channel_readv_all_eof(ioc, &iov_payload, 1, &local_err);
291
+ if (rc == -1) {
292
+ error_report_err(local_err);
293
+ goto fail;
294
+ }
295
+ }
296
+
297
+ return true;
298
+
299
+fail:
300
+ vmsg_close_fds(vmsg);
301
+
302
+ return false;
303
+}
304
+
305
+
306
+static void vu_client_start(VuServer *server);
307
+static coroutine_fn void vu_client_trip(void *opaque)
308
+{
309
+ VuServer *server = opaque;
310
+
311
+ while (!server->aio_context_changed && server->sioc) {
312
+ server->processing_msg = true;
313
+ vu_dispatch(&server->vu_dev);
314
+ server->processing_msg = false;
315
+ }
316
+
317
+ if (server->aio_context_changed && server->sioc) {
318
+ server->aio_context_changed = false;
319
+ vu_client_start(server);
320
+ }
321
+}
322
+
323
+static void vu_client_start(VuServer *server)
324
+{
325
+ server->co_trip = qemu_coroutine_create(vu_client_trip, server);
326
+ aio_co_enter(server->ctx, server->co_trip);
327
+}
100
+
328
+
101
+/*
329
+/*
102
+ * qcow2_compress()
330
+ * a wrapper for vu_kick_cb
103
+ *
331
+ *
104
+ * @dest - destination buffer, @dest_size bytes
332
+ * since aio_dispatch can only pass one user data pointer to the
105
+ * @src - source buffer, @src_size bytes
333
+ * callback function, pack VuDev and pvt into a struct. Then unpack it
106
+ *
334
+ * and pass them to vu_kick_cb
107
+ * Returns: compressed size on success
108
+ * -ENOMEM destination buffer is not enough to store compressed data
109
+ * -EIO on any other error
110
+ */
335
+ */
111
+static ssize_t qcow2_compress(void *dest, size_t dest_size,
336
+static void kick_handler(void *opaque)
112
+ const void *src, size_t src_size)
337
+{
113
+{
338
+ VuFdWatch *vu_fd_watch = opaque;
114
+ ssize_t ret;
339
+ vu_fd_watch->processing = true;
115
+ z_stream strm;
340
+ vu_fd_watch->cb(vu_fd_watch->vu_dev, 0, vu_fd_watch->pvt);
116
+
341
+ vu_fd_watch->processing = false;
117
+ /* best compression, small window, no zlib header */
342
+}
118
+ memset(&strm, 0, sizeof(strm));
343
+
119
+ ret = deflateInit2(&strm, Z_DEFAULT_COMPRESSION, Z_DEFLATED,
344
+
120
+ -12, 9, Z_DEFAULT_STRATEGY);
345
+static VuFdWatch *find_vu_fd_watch(VuServer *server, int fd)
121
+ if (ret != Z_OK) {
346
+{
122
+ return -EIO;
347
+
348
+ VuFdWatch *vu_fd_watch, *next;
349
+ QTAILQ_FOREACH_SAFE(vu_fd_watch, &server->vu_fd_watches, next, next) {
350
+ if (vu_fd_watch->fd == fd) {
351
+ return vu_fd_watch;
352
+ }
353
+ }
354
+ return NULL;
355
+}
356
+
357
+static void
358
+set_watch(VuDev *vu_dev, int fd, int vu_evt,
359
+ vu_watch_cb cb, void *pvt)
360
+{
361
+
362
+ VuServer *server = container_of(vu_dev, VuServer, vu_dev);
363
+ g_assert(vu_dev);
364
+ g_assert(fd >= 0);
365
+ g_assert(cb);
366
+
367
+ VuFdWatch *vu_fd_watch = find_vu_fd_watch(server, fd);
368
+
369
+ if (!vu_fd_watch) {
370
+ VuFdWatch *vu_fd_watch = g_new0(VuFdWatch, 1);
371
+
372
+ QTAILQ_INSERT_TAIL(&server->vu_fd_watches, vu_fd_watch, next);
373
+
374
+ vu_fd_watch->fd = fd;
375
+ vu_fd_watch->cb = cb;
376
+ qemu_set_nonblock(fd);
377
+ aio_set_fd_handler(server->ioc->ctx, fd, true, kick_handler,
378
+ NULL, NULL, vu_fd_watch);
379
+ vu_fd_watch->vu_dev = vu_dev;
380
+ vu_fd_watch->pvt = pvt;
381
+ }
382
+}
383
+
384
+
385
+static void remove_watch(VuDev *vu_dev, int fd)
386
+{
387
+ VuServer *server;
388
+ g_assert(vu_dev);
389
+ g_assert(fd >= 0);
390
+
391
+ server = container_of(vu_dev, VuServer, vu_dev);
392
+
393
+ VuFdWatch *vu_fd_watch = find_vu_fd_watch(server, fd);
394
+
395
+ if (!vu_fd_watch) {
396
+ return;
397
+ }
398
+ aio_set_fd_handler(server->ioc->ctx, fd, true, NULL, NULL, NULL, NULL);
399
+
400
+ QTAILQ_REMOVE(&server->vu_fd_watches, vu_fd_watch, next);
401
+ g_free(vu_fd_watch);
402
+}
403
+
404
+
405
+static void vu_accept(QIONetListener *listener, QIOChannelSocket *sioc,
406
+ gpointer opaque)
407
+{
408
+ VuServer *server = opaque;
409
+
410
+ if (server->sioc) {
411
+ warn_report("Only one vhost-user client is allowed to "
412
+ "connect the server one time");
413
+ return;
414
+ }
415
+
416
+ if (!vu_init(&server->vu_dev, server->max_queues, sioc->fd, panic_cb,
417
+ vu_message_read, set_watch, remove_watch, server->vu_iface)) {
418
+ error_report("Failed to initialize libvhost-user");
419
+ return;
123
+ }
420
+ }
124
+
421
+
125
+ /*
422
+ /*
126
+ * strm.next_in is not const in old zlib versions, such as those used on
423
+ * Unset the callback function for network listener to make another
127
+ * OpenBSD/NetBSD, so cast the const away
424
+ * vhost-user client keeping waiting until this client disconnects
128
+ */
425
+ */
129
+ strm.avail_in = src_size;
426
+ qio_net_listener_set_client_func(server->listener,
130
+ strm.next_in = (void *) src;
427
+ NULL,
131
+ strm.avail_out = dest_size;
428
+ NULL,
132
+ strm.next_out = dest;
429
+ NULL);
133
+
430
+ server->sioc = sioc;
134
+ ret = deflate(&strm, Z_FINISH);
431
+ /*
135
+     * Increase the object reference, so sioc will not freed by
+     * qio_net_listener_channel_func which will call object_unref(OBJECT(sioc))
+     */
+    object_ref(OBJECT(server->sioc));
+    qio_channel_set_name(QIO_CHANNEL(sioc), "vhost-user client");
+    server->ioc = QIO_CHANNEL(sioc);
+    object_ref(OBJECT(server->ioc));
+    qio_channel_attach_aio_context(server->ioc, server->ctx);
+    qio_channel_set_blocking(QIO_CHANNEL(server->sioc), false, NULL);
+    vu_client_start(server);
+}
+
+
+void vhost_user_server_stop(VuServer *server)
+{
+    if (server->sioc) {
+        close_client(server);
+    }
+
+    if (server->listener) {
+        qio_net_listener_disconnect(server->listener);
+        object_unref(OBJECT(server->listener));
+    }
+
+}
+
+void vhost_user_server_set_aio_context(VuServer *server, AioContext *ctx)
+{
+    VuFdWatch *vu_fd_watch, *next;
+    void *opaque = NULL;
+    IOHandler *io_read = NULL;
+    bool attach;
+
+    server->ctx = ctx ? ctx : qemu_get_aio_context();
+
+    if (!server->sioc) {
+        /* not yet serving any client*/
+        return;
+    }
+
+    if (ctx) {
+        qio_channel_attach_aio_context(server->ioc, ctx);
+        server->aio_context_changed = true;
+        io_read = kick_handler;
+        attach = true;
+    } else {
+        qio_channel_detach_aio_context(server->ioc);
+        /* server->ioc->ctx keeps the old AioConext */
+        ctx = server->ioc->ctx;
+        attach = false;
+    }
+
+    QTAILQ_FOREACH_SAFE(vu_fd_watch, &server->vu_fd_watches, next, next) {
+        if (vu_fd_watch->cb) {
+            opaque = attach ? vu_fd_watch : NULL;
+            aio_set_fd_handler(ctx, vu_fd_watch->fd, true,
+                               io_read, NULL, NULL,
+                               opaque);
+        }
+    }
+}
+
+
+bool vhost_user_server_start(VuServer *server,
+                             SocketAddress *socket_addr,
+                             AioContext *ctx,
+                             uint16_t max_queues,
+                             DevicePanicNotifierFn *device_panic_notifier,
+                             const VuDevIface *vu_iface,
+                             Error **errp)
+{
+    QIONetListener *listener = qio_net_listener_new();
+    if (qio_net_listener_open_sync(listener, socket_addr, 1,
+                                   errp) < 0) {
+        object_unref(OBJECT(listener));
+        return false;
+    }
+
+    /* zero out unspecified fileds */
+    *server = (VuServer) {
+        .listener = listener,
+        .vu_iface = vu_iface,
+        .max_queues = max_queues,
+        .ctx = ctx,
+        .device_panic_notifier = device_panic_notifier,
+    };
+
+    qio_net_listener_set_name(server->listener, "vhost-user-backend-listener");
+
+    qio_net_listener_set_client_func(server->listener,
+                                     vu_accept,
+                                     server,
+                                     NULL);
+
+    QTAILQ_INIT(&server->vu_fd_watches);
+    return true;
+}
diff --git a/util/meson.build b/util/meson.build
index XXXXXXX..XXXXXXX 100644
--- a/util/meson.build
+++ b/util/meson.build
@@ -XXX,XX +XXX,XX @@ if have_block
   util_ss.add(files('main-loop.c'))
   util_ss.add(files('nvdimm-utils.c'))
   util_ss.add(files('qemu-coroutine.c', 'qemu-coroutine-lock.c', 'qemu-coroutine-io.c'))
+  util_ss.add(when: 'CONFIG_LINUX', if_true: files('vhost-user-server.c'))
   util_ss.add(files('qemu-coroutine-sleep.c'))
   util_ss.add(files('qemu-co-shared-resource.c'))
   util_ss.add(files('thread-pool.c', 'qemu-timer.c'))

+    if (ret == Z_STREAM_END) {
+        ret = dest_size - strm.avail_out;
+    } else {
+        ret = (ret == Z_OK ? -ENOMEM : -EIO);
+    }
+
+    deflateEnd(&strm);
+
+    return ret;
+}
+
+/*
+ * qcow2_decompress()
+ *
+ * Decompress some data (not more than @src_size bytes) to produce exactly
+ * @dest_size bytes.
+ *
+ * @dest - destination buffer, @dest_size bytes
+ * @src - source buffer, @src_size bytes
+ *
+ * Returns: 0 on success
+ *          -1 on fail
+ */
+static ssize_t qcow2_decompress(void *dest, size_t dest_size,
+                                const void *src, size_t src_size)
+{
+    int ret = 0;
+    z_stream strm;
+
+    memset(&strm, 0, sizeof(strm));
+    strm.avail_in = src_size;
+    strm.next_in = (void *) src;
+    strm.avail_out = dest_size;
+    strm.next_out = dest;
+
+    ret = inflateInit2(&strm, -12);
+    if (ret != Z_OK) {
+        return -1;
+    }
+
+    ret = inflate(&strm, Z_FINISH);
+    if ((ret != Z_STREAM_END && ret != Z_BUF_ERROR) || strm.avail_out != 0) {
+        /*
+         * We approve Z_BUF_ERROR because we need @dest buffer to be filled, but
+         * @src buffer may be processed partly (because in qcow2 we know size of
+         * compressed data with precision of one sector)
+         */
+        ret = -1;
+    }
+
+    inflateEnd(&strm);
+
+    return ret;
+}
+
+static int qcow2_compress_pool_func(void *opaque)
+{
+    Qcow2CompressData *data = opaque;
+
+    data->ret = data->func(data->dest, data->dest_size,
+                           data->src, data->src_size);
+
+    return 0;
+}
+
+static void qcow2_compress_complete(void *opaque, int ret)
+{
+    qemu_coroutine_enter(opaque);
+}
+
+static ssize_t coroutine_fn
+qcow2_co_do_compress(BlockDriverState *bs, void *dest, size_t dest_size,
+                     const void *src, size_t src_size, Qcow2CompressFunc func)
+{
+    BDRVQcow2State *s = bs->opaque;
+    BlockAIOCB *acb;
+    ThreadPool *pool = aio_get_thread_pool(bdrv_get_aio_context(bs));
+    Qcow2CompressData arg = {
+        .dest = dest,
+        .dest_size = dest_size,
+        .src = src,
+        .src_size = src_size,
+        .func = func,
+    };
+
+    while (s->nb_compress_threads >= MAX_COMPRESS_THREADS) {
+        qemu_co_queue_wait(&s->compress_wait_queue, NULL);
+    }
+
+    s->nb_compress_threads++;
+    acb = thread_pool_submit_aio(pool, qcow2_compress_pool_func, &arg,
+                                 qcow2_compress_complete,
+                                 qemu_coroutine_self());
+
+    if (!acb) {
+        s->nb_compress_threads--;
+        return -EINVAL;
+    }
+    qemu_coroutine_yield();
+    s->nb_compress_threads--;
+    qemu_co_queue_next(&s->compress_wait_queue);
+
+    return arg.ret;
+}
+
+ssize_t coroutine_fn
+qcow2_co_compress(BlockDriverState *bs, void *dest, size_t dest_size,
+                  const void *src, size_t src_size)
+{
+    return qcow2_co_do_compress(bs, dest, dest_size, src, src_size,
+                                qcow2_compress);
+}
+
+ssize_t coroutine_fn
+qcow2_co_decompress(BlockDriverState *bs, void *dest, size_t dest_size,
+                    const void *src, size_t src_size)
+{
+    return qcow2_co_do_compress(bs, dest, dest_size, src, src_size,
+                                qcow2_decompress);
+}

diff --git a/block/qcow2.c b/block/qcow2.c
index XXXXXXX..XXXXXXX 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -XXX,XX +XXX,XX @@
 
 #include "qemu/osdep.h"
 
-#define ZLIB_CONST
-#include <zlib.h>
-
 #include "block/qdict.h"
#include "sysemu/block-backend.h"
268
#include "qemu/module.h"
269
@@ -XXX,XX +XXX,XX @@
270
#include "qapi/qobject-input-visitor.h"
271
#include "qapi/qapi-visit-block-core.h"
272
#include "crypto.h"
273
-#include "block/thread-pool.h"
274
275
/*
276
Differences with QCOW:
277
@@ -XXX,XX +XXX,XX @@ fail:
278
return ret;
279
}
280
281
-/*
282
- * qcow2_compress()
283
- *
284
- * @dest - destination buffer, @dest_size bytes
285
- * @src - source buffer, @src_size bytes
286
- *
287
- * Returns: compressed size on success
288
- * -ENOMEM destination buffer is not enough to store compressed data
289
- * -EIO on any other error
290
- */
291
-static ssize_t qcow2_compress(void *dest, size_t dest_size,
292
- const void *src, size_t src_size)
293
-{
294
- ssize_t ret;
295
- z_stream strm;
296
-
297
- /* best compression, small window, no zlib header */
298
- memset(&strm, 0, sizeof(strm));
299
- ret = deflateInit2(&strm, Z_DEFAULT_COMPRESSION, Z_DEFLATED,
300
- -12, 9, Z_DEFAULT_STRATEGY);
301
- if (ret != Z_OK) {
302
- return -EIO;
303
- }
304
-
305
- /* strm.next_in is not const in old zlib versions, such as those used on
306
- * OpenBSD/NetBSD, so cast the const away */
307
- strm.avail_in = src_size;
308
- strm.next_in = (void *) src;
309
- strm.avail_out = dest_size;
310
- strm.next_out = dest;
311
-
312
- ret = deflate(&strm, Z_FINISH);
313
- if (ret == Z_STREAM_END) {
314
- ret = dest_size - strm.avail_out;
315
- } else {
316
- ret = (ret == Z_OK ? -ENOMEM : -EIO);
317
- }
318
-
319
- deflateEnd(&strm);
320
-
321
- return ret;
322
-}
323
-
324
-/*
325
- * qcow2_decompress()
326
- *
327
- * Decompress some data (not more than @src_size bytes) to produce exactly
328
- * @dest_size bytes.
329
- *
330
- * @dest - destination buffer, @dest_size bytes
331
- * @src - source buffer, @src_size bytes
332
- *
333
- * Returns: 0 on success
334
- * -1 on fail
335
- */
336
-static ssize_t qcow2_decompress(void *dest, size_t dest_size,
337
- const void *src, size_t src_size)
338
-{
339
- int ret = 0;
340
- z_stream strm;
341
-
342
- memset(&strm, 0, sizeof(strm));
343
- strm.avail_in = src_size;
344
- strm.next_in = (void *) src;
345
- strm.avail_out = dest_size;
346
- strm.next_out = dest;
347
-
348
- ret = inflateInit2(&strm, -12);
349
- if (ret != Z_OK) {
350
- return -1;
351
- }
352
-
353
- ret = inflate(&strm, Z_FINISH);
354
- if ((ret != Z_STREAM_END && ret != Z_BUF_ERROR) || strm.avail_out != 0) {
355
- /* We approve Z_BUF_ERROR because we need @dest buffer to be filled, but
356
- * @src buffer may be processed partly (because in qcow2 we know size of
357
- * compressed data with precision of one sector) */
358
- ret = -1;
359
- }
360
-
361
- inflateEnd(&strm);
362
-
363
- return ret;
364
-}
365
-
366
-#define MAX_COMPRESS_THREADS 4
367
-
368
-typedef ssize_t (*Qcow2CompressFunc)(void *dest, size_t dest_size,
369
- const void *src, size_t src_size);
370
-typedef struct Qcow2CompressData {
371
- void *dest;
372
- size_t dest_size;
373
- const void *src;
374
- size_t src_size;
375
- ssize_t ret;
376
-
377
- Qcow2CompressFunc func;
378
-} Qcow2CompressData;
379
-
380
-static int qcow2_compress_pool_func(void *opaque)
381
-{
382
- Qcow2CompressData *data = opaque;
383
-
384
- data->ret = data->func(data->dest, data->dest_size,
385
- data->src, data->src_size);
386
-
387
- return 0;
388
-}
389
-
390
-static void qcow2_compress_complete(void *opaque, int ret)
391
-{
392
- qemu_coroutine_enter(opaque);
393
-}
394
-
395
-static ssize_t coroutine_fn
396
-qcow2_co_do_compress(BlockDriverState *bs, void *dest, size_t dest_size,
397
- const void *src, size_t src_size, Qcow2CompressFunc func)
398
-{
399
- BDRVQcow2State *s = bs->opaque;
400
- BlockAIOCB *acb;
401
- ThreadPool *pool = aio_get_thread_pool(bdrv_get_aio_context(bs));
402
- Qcow2CompressData arg = {
403
- .dest = dest,
404
- .dest_size = dest_size,
405
- .src = src,
406
- .src_size = src_size,
407
- .func = func,
408
- };
409
-
410
- while (s->nb_compress_threads >= MAX_COMPRESS_THREADS) {
411
- qemu_co_queue_wait(&s->compress_wait_queue, NULL);
412
- }
413
-
414
- s->nb_compress_threads++;
415
- acb = thread_pool_submit_aio(pool, qcow2_compress_pool_func, &arg,
416
- qcow2_compress_complete,
417
- qemu_coroutine_self());
418
-
419
- if (!acb) {
420
- s->nb_compress_threads--;
421
- return -EINVAL;
422
- }
423
- qemu_coroutine_yield();
424
- s->nb_compress_threads--;
425
- qemu_co_queue_next(&s->compress_wait_queue);
426
-
427
- return arg.ret;
428
-}
429
-
430
-static ssize_t coroutine_fn
431
-qcow2_co_compress(BlockDriverState *bs, void *dest, size_t dest_size,
432
- const void *src, size_t src_size)
433
-{
434
- return qcow2_co_do_compress(bs, dest, dest_size, src, src_size,
435
- qcow2_compress);
436
-}
437
-
438
-static ssize_t coroutine_fn
439
-qcow2_co_decompress(BlockDriverState *bs, void *dest, size_t dest_size,
440
- const void *src, size_t src_size)
441
-{
442
- return qcow2_co_do_compress(bs, dest, dest_size, src, src_size,
443
- qcow2_decompress);
444
-}
445
-
446
/* XXX: put compressed sectors first, then all the cluster aligned
447
tables to avoid losing bytes in alignment */
448
static coroutine_fn int
--
2.21.0
--
2.26.2

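As an illustration of the API above (not part of the patch; the example_*
names are hypothetical, only the vhost_user_server_start()/stop() calls come
from this series):

#include "qemu/osdep.h"
#include "qapi/error.h"
#include "util/vhost-user-server.h"

static VuServer example_server;
static const VuDevIface example_iface;   /* callbacks would be filled in elsewhere */

static bool example_export_start(SocketAddress *addr, AioContext *ctx,
                                 Error **errp)
{
    /* one virtqueue, no device panic notifier */
    return vhost_user_server_start(&example_server, addr, ctx, 1,
                                   NULL, &example_iface, errp);
}

static void example_export_stop(void)
{
    vhost_user_server_stop(&example_server);
}
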
From: Coiby Xu <coiby.xu@gmail.com>
1
2
3
Move the constants from hw/core/qdev-properties.c to
4
util/block-helpers.h so that knowledge of the min/max values is
5
6
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
7
Signed-off-by: Coiby Xu <coiby.xu@gmail.com>
8
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
9
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
10
Acked-by: Eduardo Habkost <ehabkost@redhat.com>
11
Message-id: 20200918080912.321299-5-coiby.xu@gmail.com
12
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
13
---
14
util/block-helpers.h | 19 +++++++++++++
15
hw/core/qdev-properties-system.c | 31 ++++-----------------
16
util/block-helpers.c | 46 ++++++++++++++++++++++++++++++++
17
util/meson.build | 1 +
18
4 files changed, 71 insertions(+), 26 deletions(-)
19
create mode 100644 util/block-helpers.h
20
create mode 100644 util/block-helpers.c
21
22
diff --git a/util/block-helpers.h b/util/block-helpers.h
23
new file mode 100644
24
index XXXXXXX..XXXXXXX
25
--- /dev/null
26
+++ b/util/block-helpers.h
27
@@ -XXX,XX +XXX,XX @@
28
+#ifndef BLOCK_HELPERS_H
29
+#define BLOCK_HELPERS_H
30
+
31
+#include "qemu/units.h"
32
+
33
+/* lower limit is sector size */
34
+#define MIN_BLOCK_SIZE INT64_C(512)
35
+#define MIN_BLOCK_SIZE_STR "512 B"
36
+/*
37
+ * upper limit is arbitrary, 2 MiB looks sufficient for all sensible uses, and
38
+ * matches qcow2 cluster size limit
39
+ */
40
+#define MAX_BLOCK_SIZE (2 * MiB)
41
+#define MAX_BLOCK_SIZE_STR "2 MiB"
42
+
43
+void check_block_size(const char *id, const char *name, int64_t value,
44
+ Error **errp);
45
+
46
+#endif /* BLOCK_HELPERS_H */
47
diff --git a/hw/core/qdev-properties-system.c b/hw/core/qdev-properties-system.c
48
index XXXXXXX..XXXXXXX 100644
49
--- a/hw/core/qdev-properties-system.c
50
+++ b/hw/core/qdev-properties-system.c
51
@@ -XXX,XX +XXX,XX @@
52
#include "sysemu/blockdev.h"
53
#include "net/net.h"
54
#include "hw/pci/pci.h"
55
+#include "util/block-helpers.h"
56
57
static bool check_prop_still_unset(DeviceState *dev, const char *name,
58
const void *old_val, const char *new_val,
59
@@ -XXX,XX +XXX,XX @@ const PropertyInfo qdev_prop_losttickpolicy = {
60
61
/* --- blocksize --- */
62
63
-/* lower limit is sector size */
64
-#define MIN_BLOCK_SIZE 512
65
-#define MIN_BLOCK_SIZE_STR "512 B"
66
-/*
67
- * upper limit is arbitrary, 2 MiB looks sufficient for all sensible uses, and
68
- * matches qcow2 cluster size limit
69
- */
70
-#define MAX_BLOCK_SIZE (2 * MiB)
71
-#define MAX_BLOCK_SIZE_STR "2 MiB"
72
-
73
static void set_blocksize(Object *obj, Visitor *v, const char *name,
74
void *opaque, Error **errp)
75
{
76
@@ -XXX,XX +XXX,XX @@ static void set_blocksize(Object *obj, Visitor *v, const char *name,
77
Property *prop = opaque;
78
uint32_t *ptr = qdev_get_prop_ptr(dev, prop);
79
uint64_t value;
80
+ Error *local_err = NULL;
81
82
if (dev->realized) {
83
qdev_prop_set_after_realize(dev, name, errp);
84
@@ -XXX,XX +XXX,XX @@ static void set_blocksize(Object *obj, Visitor *v, const char *name,
85
if (!visit_type_size(v, name, &value, errp)) {
86
return;
87
}
88
- /* value of 0 means "unset" */
89
- if (value && (value < MIN_BLOCK_SIZE || value > MAX_BLOCK_SIZE)) {
90
- error_setg(errp,
91
- "Property %s.%s doesn't take value %" PRIu64
92
- " (minimum: " MIN_BLOCK_SIZE_STR
93
- ", maximum: " MAX_BLOCK_SIZE_STR ")",
94
- dev->id ? : "", name, value);
95
+ check_block_size(dev->id ? : "", name, value, &local_err);
96
+ if (local_err) {
97
+ error_propagate(errp, local_err);
98
return;
99
}
100
-
101
- /* We rely on power-of-2 blocksizes for bitmasks */
102
- if ((value & (value - 1)) != 0) {
103
- error_setg(errp,
104
- "Property %s.%s doesn't take value '%" PRId64 "', "
105
- "it's not a power of 2", dev->id ?: "", name, (int64_t)value);
106
- return;
107
- }
108
-
109
*ptr = value;
110
}
111
112
diff --git a/util/block-helpers.c b/util/block-helpers.c
113
new file mode 100644
114
index XXXXXXX..XXXXXXX
115
--- /dev/null
116
+++ b/util/block-helpers.c
117
@@ -XXX,XX +XXX,XX @@
118
+/*
119
+ * Block utility functions
120
+ *
121
+ * Copyright IBM, Corp. 2011
122
+ * Copyright (c) 2020 Coiby Xu <coiby.xu@gmail.com>
123
+ *
124
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
125
+ * See the COPYING file in the top-level directory.
126
+ */
127
+
128
+#include "qemu/osdep.h"
129
+#include "qapi/error.h"
130
+#include "qapi/qmp/qerror.h"
131
+#include "block-helpers.h"
132
+
133
+/**
134
+ * check_block_size:
135
+ * @id: The unique ID of the object
136
+ * @name: The name of the property being validated
137
+ * @value: The block size in bytes
138
+ * @errp: A pointer to an area to store an error
139
+ *
140
+ * This function checks that the block size meets the following conditions:
141
+ * 1. At least MIN_BLOCK_SIZE
142
+ * 2. No larger than MAX_BLOCK_SIZE
143
+ * 3. A power of 2
144
+ */
145
+void check_block_size(const char *id, const char *name, int64_t value,
146
+ Error **errp)
147
+{
148
+ /* value of 0 means "unset" */
149
+ if (value && (value < MIN_BLOCK_SIZE || value > MAX_BLOCK_SIZE)) {
150
+ error_setg(errp, QERR_PROPERTY_VALUE_OUT_OF_RANGE,
151
+ id, name, value, MIN_BLOCK_SIZE, MAX_BLOCK_SIZE);
152
+ return;
153
+ }
154
+
155
+ /* We rely on power-of-2 blocksizes for bitmasks */
156
+ if ((value & (value - 1)) != 0) {
157
+ error_setg(errp,
158
+ "Property %s.%s doesn't take value '%" PRId64
159
+ "', it's not a power of 2",
160
+ id, name, value);
161
+ return;
162
+ }
163
+}
164
diff --git a/util/meson.build b/util/meson.build
165
index XXXXXXX..XXXXXXX 100644
166
--- a/util/meson.build
167
+++ b/util/meson.build
168
@@ -XXX,XX +XXX,XX @@ if have_block
169
util_ss.add(files('nvdimm-utils.c'))
170
util_ss.add(files('qemu-coroutine.c', 'qemu-coroutine-lock.c', 'qemu-coroutine-io.c'))
171
util_ss.add(when: 'CONFIG_LINUX', if_true: files('vhost-user-server.c'))
172
+ util_ss.add(files('block-helpers.c'))
173
util_ss.add(files('qemu-coroutine-sleep.c'))
174
util_ss.add(files('qemu-co-shared-resource.c'))
175
util_ss.add(files('thread-pool.c', 'qemu-timer.c'))
176
--
2.26.2

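As an illustration of how check_block_size() is meant to be called by its
users in this series (not part of the patch; the example_* names are
hypothetical):

#include "qemu/osdep.h"
#include "qapi/error.h"
#include "util/block-helpers.h"

static bool example_validate_block_size(int64_t value, Error **errp)
{
    Error *local_err = NULL;

    /* rejects values outside [512 B, 2 MiB] or not a power of 2; 0 means "unset" */
    check_block_size("example0", "logical-block-size", value, &local_err);
    if (local_err) {
        error_propagate(errp, local_err);
        return false;
    }
    return true;
}
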
From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

Move generic part out of qcow2_co_do_compress, to reuse it for
encryption and rename things that would be shared with encryption path.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20190506142741.41731-6-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 block/qcow2.h | 4 ++--
 block/qcow2-threads.c | 47 ++++++++++++++++++++++++++++---------------
 block/qcow2.c | 2 +-
 3 files changed, 34 insertions(+), 19 deletions(-)

From: Coiby Xu <coiby.xu@gmail.com>

By making use of libvhost-user, a block device drive can be shared with
the connected vhost-user client. Only one client can connect to the
server at a time.

Since vhost-user-server needs a block drive to be created first, delay
the creation of this object.

Suggested-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Coiby Xu <coiby.xu@gmail.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20200918080912.321299-6-coiby.xu@gmail.com
[Shorten "vhost_user_blk_server" string to "vhost_user_blk" to avoid the
following compiler warning:
../block/export/vhost-user-blk-server.c:178:50: error: ‘%s’ directive output truncated writing 21 bytes into a region of size 20 [-Werror=format-truncation=]
and fix "Invalid size %ld ..." ssize_t format string arguments for
32-bit hosts.
--Stefan]
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 block/export/vhost-user-blk-server.h | 36 ++
 block/export/vhost-user-blk-server.c | 661 +++++++++++++++++++++++++++
 softmmu/vl.c | 4 +
 block/meson.build | 1 +
 4 files changed, 702 insertions(+)
 create mode 100644 block/export/vhost-user-blk-server.h
 create mode 100644 block/export/vhost-user-blk-server.c

diff --git a/block/qcow2.h b/block/qcow2.h
32
diff --git a/block/export/vhost-user-blk-server.h b/block/export/vhost-user-blk-server.h
33
new file mode 100644
34
index XXXXXXX..XXXXXXX
35
--- /dev/null
36
+++ b/block/export/vhost-user-blk-server.h
37
@@ -XXX,XX +XXX,XX @@
38
+/*
39
+ * Sharing QEMU block devices via vhost-user protocal
40
+ *
41
+ * Copyright (c) Coiby Xu <coiby.xu@gmail.com>.
42
+ * Copyright (c) 2020 Red Hat, Inc.
43
+ *
44
+ * This work is licensed under the terms of the GNU GPL, version 2 or
45
+ * later. See the COPYING file in the top-level directory.
46
+ */
47
+
48
+#ifndef VHOST_USER_BLK_SERVER_H
49
+#define VHOST_USER_BLK_SERVER_H
50
+#include "util/vhost-user-server.h"
51
+
52
+typedef struct VuBlockDev VuBlockDev;
53
+#define TYPE_VHOST_USER_BLK_SERVER "vhost-user-blk-server"
54
+#define VHOST_USER_BLK_SERVER(obj) \
55
+ OBJECT_CHECK(VuBlockDev, obj, TYPE_VHOST_USER_BLK_SERVER)
56
+
57
+/* vhost user block device */
58
+struct VuBlockDev {
59
+ Object parent_obj;
60
+ char *node_name;
61
+ SocketAddress *addr;
62
+ AioContext *ctx;
63
+ VuServer vu_server;
64
+ bool running;
65
+ uint32_t blk_size;
66
+ BlockBackend *backend;
67
+ QIOChannelSocket *sioc;
68
+ QTAILQ_ENTRY(VuBlockDev) next;
69
+ struct virtio_blk_config blkcfg;
70
+ bool writable;
71
+};
72
+
73
+#endif /* VHOST_USER_BLK_SERVER_H */
74
diff --git a/block/export/vhost-user-blk-server.c b/block/export/vhost-user-blk-server.c
75
new file mode 100644
76
index XXXXXXX..XXXXXXX
77
--- /dev/null
78
+++ b/block/export/vhost-user-blk-server.c
79
@@ -XXX,XX +XXX,XX @@
80
+/*
81
+ * Sharing QEMU block devices via vhost-user protocal
82
+ *
83
+ * Parts of the code based on nbd/server.c.
84
+ *
85
+ * Copyright (c) Coiby Xu <coiby.xu@gmail.com>.
86
+ * Copyright (c) 2020 Red Hat, Inc.
87
+ *
88
+ * This work is licensed under the terms of the GNU GPL, version 2 or
89
+ * later. See the COPYING file in the top-level directory.
90
+ */
91
+#include "qemu/osdep.h"
92
+#include "block/block.h"
93
+#include "vhost-user-blk-server.h"
94
+#include "qapi/error.h"
95
+#include "qom/object_interfaces.h"
96
+#include "sysemu/block-backend.h"
97
+#include "util/block-helpers.h"
98
+
99
+enum {
100
+ VHOST_USER_BLK_MAX_QUEUES = 1,
101
+};
102
+struct virtio_blk_inhdr {
103
+ unsigned char status;
104
+};
105
+
106
+typedef struct VuBlockReq {
107
+ VuVirtqElement *elem;
108
+ int64_t sector_num;
109
+ size_t size;
110
+ struct virtio_blk_inhdr *in;
111
+ struct virtio_blk_outhdr out;
112
+ VuServer *server;
113
+ struct VuVirtq *vq;
114
+} VuBlockReq;
115
+
116
+static void vu_block_req_complete(VuBlockReq *req)
117
+{
118
+ VuDev *vu_dev = &req->server->vu_dev;
119
+
120
+ /* IO size with 1 extra status byte */
121
+ vu_queue_push(vu_dev, req->vq, req->elem, req->size + 1);
122
+ vu_queue_notify(vu_dev, req->vq);
123
+
124
+ if (req->elem) {
125
+ free(req->elem);
126
+ }
127
+
128
+ g_free(req);
129
+}
130
+
131
+static VuBlockDev *get_vu_block_device_by_server(VuServer *server)
132
+{
133
+ return container_of(server, VuBlockDev, vu_server);
134
+}
135
+
136
+static int coroutine_fn
137
+vu_block_discard_write_zeroes(VuBlockReq *req, struct iovec *iov,
138
+ uint32_t iovcnt, uint32_t type)
139
+{
140
+ struct virtio_blk_discard_write_zeroes desc;
141
+ ssize_t size = iov_to_buf(iov, iovcnt, 0, &desc, sizeof(desc));
142
+ if (unlikely(size != sizeof(desc))) {
143
+ error_report("Invalid size %zd, expect %zu", size, sizeof(desc));
144
+ return -EINVAL;
145
+ }
146
+
147
+ VuBlockDev *vdev_blk = get_vu_block_device_by_server(req->server);
148
+ uint64_t range[2] = { le64_to_cpu(desc.sector) << 9,
149
+ le32_to_cpu(desc.num_sectors) << 9 };
150
+ if (type == VIRTIO_BLK_T_DISCARD) {
151
+ if (blk_co_pdiscard(vdev_blk->backend, range[0], range[1]) == 0) {
152
+ return 0;
153
+ }
154
+ } else if (type == VIRTIO_BLK_T_WRITE_ZEROES) {
155
+ if (blk_co_pwrite_zeroes(vdev_blk->backend,
156
+ range[0], range[1], 0) == 0) {
157
+ return 0;
158
+ }
159
+ }
160
+
161
+ return -EINVAL;
162
+}
163
+
164
+static void coroutine_fn vu_block_flush(VuBlockReq *req)
165
+{
166
+ VuBlockDev *vdev_blk = get_vu_block_device_by_server(req->server);
167
+ BlockBackend *backend = vdev_blk->backend;
168
+ blk_co_flush(backend);
169
+}
170
+
171
+struct req_data {
172
+ VuServer *server;
173
+ VuVirtq *vq;
174
+ VuVirtqElement *elem;
175
+};
176
+
177
+static void coroutine_fn vu_block_virtio_process_req(void *opaque)
178
+{
179
+ struct req_data *data = opaque;
180
+ VuServer *server = data->server;
181
+ VuVirtq *vq = data->vq;
182
+ VuVirtqElement *elem = data->elem;
183
+ uint32_t type;
184
+ VuBlockReq *req;
185
+
186
+ VuBlockDev *vdev_blk = get_vu_block_device_by_server(server);
187
+ BlockBackend *backend = vdev_blk->backend;
188
+
189
+ struct iovec *in_iov = elem->in_sg;
190
+ struct iovec *out_iov = elem->out_sg;
191
+ unsigned in_num = elem->in_num;
192
+ unsigned out_num = elem->out_num;
193
+ /* refer to hw/block/virtio_blk.c */
194
+ if (elem->out_num < 1 || elem->in_num < 1) {
195
+ error_report("virtio-blk request missing headers");
196
+ free(elem);
197
+ return;
198
+ }
199
+
200
+ req = g_new0(VuBlockReq, 1);
201
+ req->server = server;
202
+ req->vq = vq;
203
+ req->elem = elem;
204
+
205
+ if (unlikely(iov_to_buf(out_iov, out_num, 0, &req->out,
206
+ sizeof(req->out)) != sizeof(req->out))) {
207
+ error_report("virtio-blk request outhdr too short");
208
+ goto err;
209
+ }
210
+
211
+ iov_discard_front(&out_iov, &out_num, sizeof(req->out));
212
+
213
+ if (in_iov[in_num - 1].iov_len < sizeof(struct virtio_blk_inhdr)) {
214
+ error_report("virtio-blk request inhdr too short");
215
+ goto err;
216
+ }
217
+
218
+ /* We always touch the last byte, so just see how big in_iov is. */
219
+ req->in = (void *)in_iov[in_num - 1].iov_base
220
+ + in_iov[in_num - 1].iov_len
221
+ - sizeof(struct virtio_blk_inhdr);
222
+ iov_discard_back(in_iov, &in_num, sizeof(struct virtio_blk_inhdr));
223
+
224
+ type = le32_to_cpu(req->out.type);
225
+ switch (type & ~VIRTIO_BLK_T_BARRIER) {
226
+ case VIRTIO_BLK_T_IN:
227
+ case VIRTIO_BLK_T_OUT: {
228
+ ssize_t ret = 0;
229
+ bool is_write = type & VIRTIO_BLK_T_OUT;
230
+ req->sector_num = le64_to_cpu(req->out.sector);
231
+
232
+ int64_t offset = req->sector_num * vdev_blk->blk_size;
233
+ QEMUIOVector qiov;
234
+ if (is_write) {
235
+ qemu_iovec_init_external(&qiov, out_iov, out_num);
236
+ ret = blk_co_pwritev(backend, offset, qiov.size,
237
+ &qiov, 0);
238
+ } else {
239
+ qemu_iovec_init_external(&qiov, in_iov, in_num);
240
+ ret = blk_co_preadv(backend, offset, qiov.size,
241
+ &qiov, 0);
242
+ }
243
+ if (ret >= 0) {
244
+ req->in->status = VIRTIO_BLK_S_OK;
245
+ } else {
246
+ req->in->status = VIRTIO_BLK_S_IOERR;
247
+ }
248
+ break;
249
+ }
250
+ case VIRTIO_BLK_T_FLUSH:
251
+ vu_block_flush(req);
252
+ req->in->status = VIRTIO_BLK_S_OK;
253
+ break;
254
+ case VIRTIO_BLK_T_GET_ID: {
255
+ size_t size = MIN(iov_size(&elem->in_sg[0], in_num),
256
+ VIRTIO_BLK_ID_BYTES);
257
+ snprintf(elem->in_sg[0].iov_base, size, "%s", "vhost_user_blk");
258
+ req->in->status = VIRTIO_BLK_S_OK;
259
+ req->size = elem->in_sg[0].iov_len;
260
+ break;
261
+ }
262
+ case VIRTIO_BLK_T_DISCARD:
263
+ case VIRTIO_BLK_T_WRITE_ZEROES: {
264
+ int rc;
265
+ rc = vu_block_discard_write_zeroes(req, &elem->out_sg[1],
266
+ out_num, type);
267
+ if (rc == 0) {
268
+ req->in->status = VIRTIO_BLK_S_OK;
269
+ } else {
270
+ req->in->status = VIRTIO_BLK_S_IOERR;
271
+ }
272
+ break;
273
+ }
274
+ default:
275
+ req->in->status = VIRTIO_BLK_S_UNSUPP;
276
+ break;
277
+ }
278
+
279
+ vu_block_req_complete(req);
280
+ return;
281
+
282
+err:
283
+ free(elem);
284
+ g_free(req);
285
+ return;
286
+}
287
+
288
+static void vu_block_process_vq(VuDev *vu_dev, int idx)
289
+{
290
+ VuServer *server;
291
+ VuVirtq *vq;
292
+ struct req_data *req_data;
293
+
294
+ server = container_of(vu_dev, VuServer, vu_dev);
295
+ assert(server);
296
+
297
+ vq = vu_get_queue(vu_dev, idx);
298
+ assert(vq);
299
+ VuVirtqElement *elem;
300
+ while (1) {
301
+ elem = vu_queue_pop(vu_dev, vq, sizeof(VuVirtqElement) +
302
+ sizeof(VuBlockReq));
303
+ if (elem) {
304
+ req_data = g_new0(struct req_data, 1);
305
+ req_data->server = server;
306
+ req_data->vq = vq;
307
+ req_data->elem = elem;
308
+ Coroutine *co = qemu_coroutine_create(vu_block_virtio_process_req,
309
+ req_data);
310
+ aio_co_enter(server->ioc->ctx, co);
311
+ } else {
312
+ break;
313
+ }
314
+ }
315
+}
316
+
317
+static void vu_block_queue_set_started(VuDev *vu_dev, int idx, bool started)
318
+{
319
+ VuVirtq *vq;
320
+
321
+ assert(vu_dev);
322
+
323
+ vq = vu_get_queue(vu_dev, idx);
324
+ vu_set_queue_handler(vu_dev, vq, started ? vu_block_process_vq : NULL);
325
+}
326
+
327
+static uint64_t vu_block_get_features(VuDev *dev)
328
+{
329
+ uint64_t features;
330
+ VuServer *server = container_of(dev, VuServer, vu_dev);
331
+ VuBlockDev *vdev_blk = get_vu_block_device_by_server(server);
332
+ features = 1ull << VIRTIO_BLK_F_SIZE_MAX |
333
+ 1ull << VIRTIO_BLK_F_SEG_MAX |
334
+ 1ull << VIRTIO_BLK_F_TOPOLOGY |
335
+ 1ull << VIRTIO_BLK_F_BLK_SIZE |
336
+ 1ull << VIRTIO_BLK_F_FLUSH |
337
+ 1ull << VIRTIO_BLK_F_DISCARD |
338
+ 1ull << VIRTIO_BLK_F_WRITE_ZEROES |
339
+ 1ull << VIRTIO_BLK_F_CONFIG_WCE |
340
+ 1ull << VIRTIO_F_VERSION_1 |
341
+ 1ull << VIRTIO_RING_F_INDIRECT_DESC |
342
+ 1ull << VIRTIO_RING_F_EVENT_IDX |
343
+ 1ull << VHOST_USER_F_PROTOCOL_FEATURES;
344
+
345
+ if (!vdev_blk->writable) {
346
+ features |= 1ull << VIRTIO_BLK_F_RO;
347
+ }
348
+
349
+ return features;
350
+}
351
+
352
+static uint64_t vu_block_get_protocol_features(VuDev *dev)
353
+{
354
+ return 1ull << VHOST_USER_PROTOCOL_F_CONFIG |
355
+ 1ull << VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD;
356
+}
357
+
358
+static int
359
+vu_block_get_config(VuDev *vu_dev, uint8_t *config, uint32_t len)
360
+{
361
+ VuServer *server = container_of(vu_dev, VuServer, vu_dev);
362
+ VuBlockDev *vdev_blk = get_vu_block_device_by_server(server);
363
+ memcpy(config, &vdev_blk->blkcfg, len);
364
+
365
+ return 0;
366
+}
367
+
368
+static int
369
+vu_block_set_config(VuDev *vu_dev, const uint8_t *data,
370
+ uint32_t offset, uint32_t size, uint32_t flags)
371
+{
372
+ VuServer *server = container_of(vu_dev, VuServer, vu_dev);
373
+ VuBlockDev *vdev_blk = get_vu_block_device_by_server(server);
374
+ uint8_t wce;
375
+
376
+ /* don't support live migration */
377
+ if (flags != VHOST_SET_CONFIG_TYPE_MASTER) {
378
+ return -EINVAL;
379
+ }
380
+
381
+ if (offset != offsetof(struct virtio_blk_config, wce) ||
382
+ size != 1) {
383
+ return -EINVAL;
384
+ }
385
+
386
+ wce = *data;
387
+ vdev_blk->blkcfg.wce = wce;
388
+ blk_set_enable_write_cache(vdev_blk->backend, wce);
389
+ return 0;
390
+}
391
+
392
+/*
393
+ * When the client disconnects, it sends a VHOST_USER_NONE request
394
+ * and vu_process_message will simple call exit which cause the VM
395
+ * to exit abruptly.
396
+ * To avoid this issue, process VHOST_USER_NONE request ahead
397
+ * of vu_process_message.
398
+ *
399
+ */
400
+static int vu_block_process_msg(VuDev *dev, VhostUserMsg *vmsg, int *do_reply)
401
+{
402
+ if (vmsg->request == VHOST_USER_NONE) {
403
+ dev->panic(dev, "disconnect");
404
+ return true;
405
+ }
406
+ return false;
407
+}
408
+
409
+static const VuDevIface vu_block_iface = {
410
+ .get_features = vu_block_get_features,
411
+ .queue_set_started = vu_block_queue_set_started,
412
+ .get_protocol_features = vu_block_get_protocol_features,
413
+ .get_config = vu_block_get_config,
414
+ .set_config = vu_block_set_config,
415
+ .process_msg = vu_block_process_msg,
416
+};
417
+
418
+static void blk_aio_attached(AioContext *ctx, void *opaque)
419
+{
420
+ VuBlockDev *vub_dev = opaque;
421
+ aio_context_acquire(ctx);
422
+ vhost_user_server_set_aio_context(&vub_dev->vu_server, ctx);
423
+ aio_context_release(ctx);
424
+}
425
+
426
+static void blk_aio_detach(void *opaque)
427
+{
428
+ VuBlockDev *vub_dev = opaque;
429
+ AioContext *ctx = vub_dev->vu_server.ctx;
430
+ aio_context_acquire(ctx);
431
+ vhost_user_server_set_aio_context(&vub_dev->vu_server, NULL);
432
+ aio_context_release(ctx);
433
+}
434
+
435
+static void
436
+vu_block_initialize_config(BlockDriverState *bs,
437
+ struct virtio_blk_config *config, uint32_t blk_size)
438
+{
439
+ config->capacity = bdrv_getlength(bs) >> BDRV_SECTOR_BITS;
440
+ config->blk_size = blk_size;
441
+ config->size_max = 0;
442
+ config->seg_max = 128 - 2;
443
+ config->min_io_size = 1;
444
+ config->opt_io_size = 1;
445
+ config->num_queues = VHOST_USER_BLK_MAX_QUEUES;
446
+ config->max_discard_sectors = 32768;
447
+ config->max_discard_seg = 1;
448
+ config->discard_sector_alignment = config->blk_size >> 9;
449
+ config->max_write_zeroes_sectors = 32768;
450
+ config->max_write_zeroes_seg = 1;
451
+}
452
+
453
+static VuBlockDev *vu_block_init(VuBlockDev *vu_block_device, Error **errp)
454
+{
455
+
456
+ BlockBackend *blk;
457
+ Error *local_error = NULL;
458
+ const char *node_name = vu_block_device->node_name;
459
+ bool writable = vu_block_device->writable;
460
+ uint64_t perm = BLK_PERM_CONSISTENT_READ;
461
+ int ret;
462
+
463
+ AioContext *ctx;
464
+
465
+ BlockDriverState *bs = bdrv_lookup_bs(node_name, node_name, &local_error);
466
+
467
+ if (!bs) {
468
+ error_propagate(errp, local_error);
469
+ return NULL;
470
+ }
471
+
472
+ if (bdrv_is_read_only(bs)) {
473
+ writable = false;
474
+ }
475
+
476
+ if (writable) {
477
+ perm |= BLK_PERM_WRITE;
478
+ }
479
+
480
+ ctx = bdrv_get_aio_context(bs);
481
+ aio_context_acquire(ctx);
482
+ bdrv_invalidate_cache(bs, NULL);
483
+ aio_context_release(ctx);
484
+
485
+ /*
486
+ * Don't allow resize while the vhost user server is running,
487
+ * otherwise we don't care what happens with the node.
488
+ */
489
+ blk = blk_new(bdrv_get_aio_context(bs), perm,
490
+ BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE_UNCHANGED |
491
+ BLK_PERM_WRITE | BLK_PERM_GRAPH_MOD);
492
+ ret = blk_insert_bs(blk, bs, errp);
493
+
494
+ if (ret < 0) {
495
+ goto fail;
496
+ }
497
+
498
+ blk_set_enable_write_cache(blk, false);
499
+
500
+ blk_set_allow_aio_context_change(blk, true);
501
+
502
+ vu_block_device->blkcfg.wce = 0;
503
+ vu_block_device->backend = blk;
504
+ if (!vu_block_device->blk_size) {
505
+ vu_block_device->blk_size = BDRV_SECTOR_SIZE;
506
+ }
507
+ vu_block_device->blkcfg.blk_size = vu_block_device->blk_size;
508
+ blk_set_guest_block_size(blk, vu_block_device->blk_size);
509
+ vu_block_initialize_config(bs, &vu_block_device->blkcfg,
510
+ vu_block_device->blk_size);
511
+ return vu_block_device;
512
+
513
+fail:
514
+ blk_unref(blk);
515
+ return NULL;
516
+}
517
+
518
+static void vu_block_deinit(VuBlockDev *vu_block_device)
519
+{
520
+ if (vu_block_device->backend) {
521
+ blk_remove_aio_context_notifier(vu_block_device->backend, blk_aio_attached,
522
+ blk_aio_detach, vu_block_device);
523
+ }
524
+
525
+ blk_unref(vu_block_device->backend);
526
+}
527
+
528
+static void vhost_user_blk_server_stop(VuBlockDev *vu_block_device)
529
+{
530
+ vhost_user_server_stop(&vu_block_device->vu_server);
531
+ vu_block_deinit(vu_block_device);
532
+}
533
+
534
+static void vhost_user_blk_server_start(VuBlockDev *vu_block_device,
535
+ Error **errp)
536
+{
537
+ AioContext *ctx;
538
+ SocketAddress *addr = vu_block_device->addr;
539
+
540
+ if (!vu_block_init(vu_block_device, errp)) {
541
+ return;
542
+ }
543
+
544
+ ctx = bdrv_get_aio_context(blk_bs(vu_block_device->backend));
545
+
546
+ if (!vhost_user_server_start(&vu_block_device->vu_server, addr, ctx,
547
+ VHOST_USER_BLK_MAX_QUEUES,
548
+ NULL, &vu_block_iface,
549
+ errp)) {
550
+ goto error;
551
+ }
552
+
553
+ blk_add_aio_context_notifier(vu_block_device->backend, blk_aio_attached,
554
+ blk_aio_detach, vu_block_device);
555
+ vu_block_device->running = true;
556
+ return;
557
+
558
+ error:
559
+ vu_block_deinit(vu_block_device);
560
+}
561
+
562
+static bool vu_prop_modifiable(VuBlockDev *vus, Error **errp)
563
+{
564
+ if (vus->running) {
565
+ error_setg(errp, "The property can't be modified "
566
+ "while the server is running");
567
+ return false;
568
+ }
569
+ return true;
570
+}
571
+
572
+static void vu_set_node_name(Object *obj, const char *value, Error **errp)
573
+{
574
+ VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
575
+
576
+ if (!vu_prop_modifiable(vus, errp)) {
577
+ return;
578
+ }
579
+
580
+ if (vus->node_name) {
581
+ g_free(vus->node_name);
582
+ }
583
+
584
+ vus->node_name = g_strdup(value);
585
+}
586
+
587
+static char *vu_get_node_name(Object *obj, Error **errp)
588
+{
589
+ VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
590
+ return g_strdup(vus->node_name);
591
+}
592
+
593
+static void free_socket_addr(SocketAddress *addr)
594
+{
595
+ g_free(addr->u.q_unix.path);
596
+ g_free(addr);
597
+}
598
+
599
+static void vu_set_unix_socket(Object *obj, const char *value,
600
+ Error **errp)
601
+{
602
+ VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
603
+
604
+ if (!vu_prop_modifiable(vus, errp)) {
605
+ return;
606
+ }
607
+
608
+ if (vus->addr) {
609
+ free_socket_addr(vus->addr);
610
+ }
611
+
612
+ SocketAddress *addr = g_new0(SocketAddress, 1);
613
+ addr->type = SOCKET_ADDRESS_TYPE_UNIX;
614
+ addr->u.q_unix.path = g_strdup(value);
615
+ vus->addr = addr;
616
+}
617
+
618
+static char *vu_get_unix_socket(Object *obj, Error **errp)
619
+{
620
+ VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
621
+ return g_strdup(vus->addr->u.q_unix.path);
622
+}
623
+
624
+static bool vu_get_block_writable(Object *obj, Error **errp)
625
+{
626
+ VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
627
+ return vus->writable;
628
+}
629
+
630
+static void vu_set_block_writable(Object *obj, bool value, Error **errp)
631
+{
632
+ VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
633
+
634
+ if (!vu_prop_modifiable(vus, errp)) {
635
+ return;
636
+ }
637
+
638
+ vus->writable = value;
639
+}
640
+
641
+static void vu_get_blk_size(Object *obj, Visitor *v, const char *name,
642
+ void *opaque, Error **errp)
643
+{
644
+ VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
645
+ uint32_t value = vus->blk_size;
646
+
647
+ visit_type_uint32(v, name, &value, errp);
648
+}
649
+
650
+static void vu_set_blk_size(Object *obj, Visitor *v, const char *name,
651
+ void *opaque, Error **errp)
652
+{
653
+ VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
654
+
655
+ Error *local_err = NULL;
656
+ uint32_t value;
657
+
658
+ if (!vu_prop_modifiable(vus, errp)) {
659
+ return;
660
+ }
661
+
662
+ visit_type_uint32(v, name, &value, &local_err);
663
+ if (local_err) {
664
+ goto out;
665
+ }
666
+
667
+ check_block_size(object_get_typename(obj), name, value, &local_err);
668
+ if (local_err) {
669
+ goto out;
670
+ }
671
+
672
+ vus->blk_size = value;
673
+
674
+out:
675
+ error_propagate(errp, local_err);
676
+}
677
+
678
+static void vhost_user_blk_server_instance_finalize(Object *obj)
679
+{
680
+ VuBlockDev *vub = VHOST_USER_BLK_SERVER(obj);
681
+
682
+ vhost_user_blk_server_stop(vub);
683
+
684
+ /*
685
+ * Unlike object_property_add_str, object_class_property_add_str
686
+ * doesn't have a release method. Thus manual memory freeing is
687
+ * needed.
688
+ */
689
+ free_socket_addr(vub->addr);
690
+ g_free(vub->node_name);
691
+}
692
+
693
+static void vhost_user_blk_server_complete(UserCreatable *obj, Error **errp)
694
+{
695
+ VuBlockDev *vub = VHOST_USER_BLK_SERVER(obj);
696
+
697
+ vhost_user_blk_server_start(vub, errp);
698
+}
699
+
700
+static void vhost_user_blk_server_class_init(ObjectClass *klass,
701
+ void *class_data)
702
+{
703
+ UserCreatableClass *ucc = USER_CREATABLE_CLASS(klass);
704
+ ucc->complete = vhost_user_blk_server_complete;
705
+
706
+ object_class_property_add_bool(klass, "writable",
707
+ vu_get_block_writable,
708
+ vu_set_block_writable);
709
+
710
+ object_class_property_add_str(klass, "node-name",
711
+ vu_get_node_name,
712
+ vu_set_node_name);
713
+
714
+ object_class_property_add_str(klass, "unix-socket",
715
+ vu_get_unix_socket,
716
+ vu_set_unix_socket);
717
+
718
+ object_class_property_add(klass, "logical-block-size", "uint32",
719
+ vu_get_blk_size, vu_set_blk_size,
720
+ NULL, NULL);
721
+}
722
+
723
+static const TypeInfo vhost_user_blk_server_info = {
724
+ .name = TYPE_VHOST_USER_BLK_SERVER,
725
+ .parent = TYPE_OBJECT,
726
+ .instance_size = sizeof(VuBlockDev),
727
+ .instance_finalize = vhost_user_blk_server_instance_finalize,
728
+ .class_init = vhost_user_blk_server_class_init,
729
+ .interfaces = (InterfaceInfo[]) {
730
+ {TYPE_USER_CREATABLE},
731
+ {}
732
+ },
733
+};
734
+
735
+static void vhost_user_blk_server_register_types(void)
736
+{
737
+ type_register_static(&vhost_user_blk_server_info);
738
+}
739
+
740
+type_init(vhost_user_blk_server_register_types)
741
diff --git a/softmmu/vl.c b/softmmu/vl.c
18
index XXXXXXX..XXXXXXX 100644
742
index XXXXXXX..XXXXXXX 100644
19
--- a/block/qcow2.h
743
--- a/softmmu/vl.c
20
+++ b/block/qcow2.h
744
+++ b/softmmu/vl.c
21
@@ -XXX,XX +XXX,XX @@ typedef struct BDRVQcow2State {
745
@@ -XXX,XX +XXX,XX @@ static bool object_create_initial(const char *type, QemuOpts *opts)
22
char *image_backing_format;
23
char *image_data_file;
24
25
- CoQueue compress_wait_queue;
26
- int nb_compress_threads;
27
+ CoQueue thread_task_queue;
28
+ int nb_threads;
29
30
BdrvChild *data_file;
31
} BDRVQcow2State;
32
diff --git a/block/qcow2-threads.c b/block/qcow2-threads.c
33
index XXXXXXX..XXXXXXX 100644
34
--- a/block/qcow2-threads.c
35
+++ b/block/qcow2-threads.c
36
@@ -XXX,XX +XXX,XX @@
37
#include "qcow2.h"
38
#include "block/thread-pool.h"
39
40
-#define MAX_COMPRESS_THREADS 4
41
+#define QCOW2_MAX_THREADS 4
42
+
43
+static int coroutine_fn
44
+qcow2_co_process(BlockDriverState *bs, ThreadPoolFunc *func, void *arg)
45
+{
46
+ int ret;
47
+ BDRVQcow2State *s = bs->opaque;
48
+ ThreadPool *pool = aio_get_thread_pool(bdrv_get_aio_context(bs));
49
+
50
+ qemu_co_mutex_lock(&s->lock);
51
+ while (s->nb_threads >= QCOW2_MAX_THREADS) {
52
+ qemu_co_queue_wait(&s->thread_task_queue, &s->lock);
53
+ }
54
+ s->nb_threads++;
55
+ qemu_co_mutex_unlock(&s->lock);
56
+
57
+ ret = thread_pool_submit_co(pool, func, arg);
58
+
59
+ qemu_co_mutex_lock(&s->lock);
60
+ s->nb_threads--;
61
+ qemu_co_queue_next(&s->thread_task_queue);
62
+ qemu_co_mutex_unlock(&s->lock);
63
+
64
+ return ret;
65
+}
66
+
67
+
68
+/*
69
+ * Compression
70
+ */
71
72
typedef ssize_t (*Qcow2CompressFunc)(void *dest, size_t dest_size,
73
const void *src, size_t src_size);
74
@@ -XXX,XX +XXX,XX @@ static ssize_t coroutine_fn
75
qcow2_co_do_compress(BlockDriverState *bs, void *dest, size_t dest_size,
76
const void *src, size_t src_size, Qcow2CompressFunc func)
77
{
78
- BDRVQcow2State *s = bs->opaque;
79
- ThreadPool *pool = aio_get_thread_pool(bdrv_get_aio_context(bs));
80
Qcow2CompressData arg = {
81
.dest = dest,
82
.dest_size = dest_size,
83
@@ -XXX,XX +XXX,XX @@ qcow2_co_do_compress(BlockDriverState *bs, void *dest, size_t dest_size,
84
.func = func,
85
};
86
87
- qemu_co_mutex_lock(&s->lock);
88
- while (s->nb_compress_threads >= MAX_COMPRESS_THREADS) {
89
- qemu_co_queue_wait(&s->compress_wait_queue, &s->lock);
90
- }
91
- s->nb_compress_threads++;
92
- qemu_co_mutex_unlock(&s->lock);
93
-
94
- thread_pool_submit_co(pool, qcow2_compress_pool_func, &arg);
95
-
96
- qemu_co_mutex_lock(&s->lock);
97
- s->nb_compress_threads--;
98
- qemu_co_queue_next(&s->compress_wait_queue);
99
- qemu_co_mutex_unlock(&s->lock);
100
+ qcow2_co_process(bs, qcow2_compress_pool_func, &arg);
101
102
return arg.ret;
103
}
104
diff --git a/block/qcow2.c b/block/qcow2.c
105
index XXXXXXX..XXXXXXX 100644
106
--- a/block/qcow2.c
107
+++ b/block/qcow2.c
108
@@ -XXX,XX +XXX,XX @@ static int coroutine_fn qcow2_do_open(BlockDriverState *bs, QDict *options,
109
}
746
}
110
#endif
747
#endif
111
748
112
- qemu_co_queue_init(&s->compress_wait_queue);
749
+ /* Reason: vhost-user-blk-server property "node-name" */
113
+ qemu_co_queue_init(&s->thread_task_queue);
750
+ if (g_str_equal(type, "vhost-user-blk-server")) {
114
751
+ return false;
115
return ret;
752
+ }
116
753
/*
754
* Reason: filter-* property "netdev" etc.
755
*/
756
diff --git a/block/meson.build b/block/meson.build
757
index XXXXXXX..XXXXXXX 100644
758
--- a/block/meson.build
759
+++ b/block/meson.build
760
@@ -XXX,XX +XXX,XX @@ block_ss.add(when: 'CONFIG_WIN32', if_true: files('file-win32.c', 'win32-aio.c')
761
block_ss.add(when: 'CONFIG_POSIX', if_true: [files('file-posix.c'), coref, iokit])
762
block_ss.add(when: 'CONFIG_LIBISCSI', if_true: files('iscsi-opts.c'))
763
block_ss.add(when: 'CONFIG_LINUX', if_true: files('nvme.c'))
764
+block_ss.add(when: 'CONFIG_LINUX', if_true: files('export/vhost-user-blk-server.c', '../contrib/libvhost-user/libvhost-user.c'))
765
block_ss.add(when: 'CONFIG_REPLICATION', if_true: files('replication.c'))
766
block_ss.add(when: 'CONFIG_SHEEPDOG', if_true: files('sheepdog.c'))
767
block_ss.add(when: ['CONFIG_LINUX_AIO', libaio], if_true: files('linux-aio.c'))
--
2.21.0
--
2.26.2

From: Coiby Xu <coiby.xu@gmail.com>
1
2
3
Suggested-by: Stefano Garzarella <sgarzare@redhat.com>
4
Signed-off-by: Coiby Xu <coiby.xu@gmail.com>
5
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
6
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
7
Message-id: 20200918080912.321299-8-coiby.xu@gmail.com
8
[Removed reference to vhost-user-blk-test.c, it will be sent in a
9
separate pull request.
10
--Stefan]
11
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
12
---
13
MAINTAINERS | 7 +++++++
14
1 file changed, 7 insertions(+)
15
16
diff --git a/MAINTAINERS b/MAINTAINERS
17
index XXXXXXX..XXXXXXX 100644
18
--- a/MAINTAINERS
19
+++ b/MAINTAINERS
20
@@ -XXX,XX +XXX,XX @@ L: qemu-block@nongnu.org
21
S: Supported
22
F: tests/image-fuzzer/
23
24
+Vhost-user block device backend server
25
+M: Coiby Xu <Coiby.Xu@gmail.com>
26
+S: Maintained
27
+F: block/export/vhost-user-blk-server.c
28
+F: util/vhost-user-server.c
29
+F: tests/qtest/libqos/vhost-user-blk.c
30
+
31
Replication
32
M: Wen Congyang <wencongyang2@huawei.com>
33
M: Xie Changlong <xiechanglong.d@gmail.com>
--
2.26.2

From: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>

Valgrind detects multiple issues in QEMU iotests when the memory is
used without being initialized. Valgrind may dump lots of unnecessary
reports what makes the memory issue analysis harder. Particularly,
that is true for the aligned bitmap directory and can be seen while
running the iotest #169. Padding the aligned space with zeros eases
the pain.

Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
Message-id: 1558961521-131620-1-git-send-email-andrey.shinkevich@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 block/qcow2-bitmap.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/qcow2-bitmap.c b/block/qcow2-bitmap.c
index XXXXXXX..XXXXXXX 100644
--- a/block/qcow2-bitmap.c
+++ b/block/qcow2-bitmap.c
@@ -XXX,XX +XXX,XX @@ static int bitmap_list_store(BlockDriverState *bs, Qcow2BitmapList *bm_list,
         dir_offset = *offset;
     }
 
-    dir = g_try_malloc(dir_size);
+    dir = g_try_malloc0(dir_size);
     if (dir == NULL) {
         return -ENOMEM;
     }
--
2.21.0

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20200924151549.913737-3-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 util/vhost-user-server.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/util/vhost-user-server.c b/util/vhost-user-server.c
index XXXXXXX..XXXXXXX 100644
--- a/util/vhost-user-server.c
+++ b/util/vhost-user-server.c
@@ -XXX,XX +XXX,XX @@ bool vhost_user_server_start(VuServer *server,
         return false;
     }
 
-    /* zero out unspecified fileds */
+    /* zero out unspecified fields */
     *server = (VuServer) {
         .listener = listener,
         .vu_iface = vu_iface,
--
2.26.2

We already have access to the value with the correct type (ioc and sioc
2
are the same QIOChannel).
1
3
4
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
5
Message-id: 20200924151549.913737-4-stefanha@redhat.com
6
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
7
---
8
util/vhost-user-server.c | 2 +-
9
1 file changed, 1 insertion(+), 1 deletion(-)
10
11
diff --git a/util/vhost-user-server.c b/util/vhost-user-server.c
12
index XXXXXXX..XXXXXXX 100644
13
--- a/util/vhost-user-server.c
14
+++ b/util/vhost-user-server.c
15
@@ -XXX,XX +XXX,XX @@ static void vu_accept(QIONetListener *listener, QIOChannelSocket *sioc,
16
server->ioc = QIO_CHANNEL(sioc);
17
object_ref(OBJECT(server->ioc));
18
qio_channel_attach_aio_context(server->ioc, server->ctx);
19
- qio_channel_set_blocking(QIO_CHANNEL(server->sioc), false, NULL);
20
+ qio_channel_set_blocking(server->ioc, false, NULL);
21
vu_client_start(server);
22
}
--
2.26.2

From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
1
Explicitly deleting watches is not necessary since libvhost-user calls
2
remove_watch() during vu_deinit(). Add an assertion to check this
3
though.
2
4
3
Do full, top and incremental mode copying all in one place. This
5
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
4
unifies the code path and helps further improvements.
6
Message-id: 20200924151549.913737-5-stefanha@redhat.com
7
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
8
---
9
util/vhost-user-server.c | 19 ++++---------------
10
1 file changed, 4 insertions(+), 15 deletions(-)
5
11
6
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
12
diff --git a/util/vhost-user-server.c b/util/vhost-user-server.c
7
Reviewed-by: Max Reitz <mreitz@redhat.com>
8
Message-id: 20190429090842.57910-5-vsementsov@virtuozzo.com
9
Signed-off-by: Max Reitz <mreitz@redhat.com>
10
---
11
block/backup.c | 43 ++++++++++---------------------------------
12
1 file changed, 10 insertions(+), 33 deletions(-)
13
14
diff --git a/block/backup.c b/block/backup.c
15
index XXXXXXX..XXXXXXX 100644
13
index XXXXXXX..XXXXXXX 100644
16
--- a/block/backup.c
14
--- a/util/vhost-user-server.c
17
+++ b/block/backup.c
15
+++ b/util/vhost-user-server.c
18
@@ -XXX,XX +XXX,XX @@ static bool bdrv_is_unallocated_range(BlockDriverState *bs,
16
@@ -XXX,XX +XXX,XX @@ static void close_client(VuServer *server)
19
return offset >= end;
17
/* When this is set vu_client_trip will stop new processing vhost-user message */
20
}
18
server->sioc = NULL;
21
19
22
-static int coroutine_fn backup_run_incremental(BackupBlockJob *job)
20
- VuFdWatch *vu_fd_watch, *next;
23
+static int coroutine_fn backup_loop(BackupBlockJob *job)
21
- QTAILQ_FOREACH_SAFE(vu_fd_watch, &server->vu_fd_watches, next, next) {
24
{
22
- aio_set_fd_handler(server->ioc->ctx, vu_fd_watch->fd, true, NULL,
25
int ret;
23
- NULL, NULL, NULL);
26
bool error_is_read;
24
- }
27
int64_t offset;
28
HBitmapIter hbi;
29
+ BlockDriverState *bs = blk_bs(job->common.blk);
30
31
hbitmap_iter_init(&hbi, job->copy_bitmap, 0);
32
while ((offset = hbitmap_iter_next(&hbi)) != -1) {
33
+ if (job->sync_mode == MIRROR_SYNC_MODE_TOP &&
34
+ bdrv_is_unallocated_range(bs, offset, job->cluster_size))
35
+ {
36
+ hbitmap_reset(job->copy_bitmap, offset, job->cluster_size);
37
+ continue;
38
+ }
39
+
40
do {
41
if (yield_and_check(job)) {
42
return 0;
43
@@ -XXX,XX +XXX,XX @@ static int coroutine_fn backup_run(Job *job, Error **errp)
44
{
45
BackupBlockJob *s = container_of(job, BackupBlockJob, common.job);
46
BlockDriverState *bs = blk_bs(s->common.blk);
47
- int64_t offset;
48
int ret = 0;
49
50
QLIST_INIT(&s->inflight_reqs);
51
@@ -XXX,XX +XXX,XX @@ static int coroutine_fn backup_run(Job *job, Error **errp)
52
* notify callback service CoW requests. */
53
job_yield(job);
54
}
55
- } else if (s->sync_mode == MIRROR_SYNC_MODE_INCREMENTAL) {
56
- ret = backup_run_incremental(s);
57
} else {
58
- /* Both FULL and TOP SYNC_MODE's require copying.. */
59
- for (offset = 0; offset < s->len;
60
- offset += s->cluster_size) {
61
- bool error_is_read;
62
-
25
-
63
- if (yield_and_check(s)) {
26
- while (!QTAILQ_EMPTY(&server->vu_fd_watches)) {
64
- break;
27
- QTAILQ_FOREACH_SAFE(vu_fd_watch, &server->vu_fd_watches, next, next) {
65
- }
28
- if (!vu_fd_watch->processing) {
66
-
29
- QTAILQ_REMOVE(&server->vu_fd_watches, vu_fd_watch, next);
67
- if (s->sync_mode == MIRROR_SYNC_MODE_TOP &&
30
- g_free(vu_fd_watch);
68
- bdrv_is_unallocated_range(bs, offset, s->cluster_size))
69
- {
70
- continue;
71
- }
72
-
73
- ret = backup_do_cow(s, offset, s->cluster_size,
74
- &error_is_read, false);
75
- if (ret < 0) {
76
- /* Depending on error action, fail now or retry cluster */
77
- BlockErrorAction action =
78
- backup_error_action(s, error_is_read, -ret);
79
- if (action == BLOCK_ERROR_ACTION_REPORT) {
80
- break;
81
- } else {
82
- offset -= s->cluster_size;
83
- continue;
84
- }
85
- }
31
- }
86
- }
32
- }
87
+ ret = backup_loop(s);
33
- }
34
-
35
while (server->processing_msg) {
36
if (server->ioc->read_coroutine) {
37
server->ioc->read_coroutine = NULL;
38
@@ -XXX,XX +XXX,XX @@ static void close_client(VuServer *server)
88
}
39
}
89
40
90
notifier_with_return_remove(&s->before_write);
41
vu_deinit(&server->vu_dev);
42
+
43
+ /* vu_deinit() should have called remove_watch() */
44
+ assert(QTAILQ_EMPTY(&server->vu_fd_watches));
45
+
46
object_unref(OBJECT(sioc));
47
object_unref(OBJECT(server->ioc));
48
}
--
2.21.0
--
2.26.2

From: Alberto Garcia <berto@igalia.com>
1
Only one struct is needed per request. Drop req_data and the separate
2
VuBlockReq instance. Instead let vu_queue_pop() allocate everything at
3
once.
2
4
3
bdrv_unref_child() does the following things:
5
This fixes the req_data memory leak in vu_block_virtio_process_req().
4
6
5
- Updates the child->bs->inherits_from pointer.
7
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
6
- Calls bdrv_detach_child() to remove the BdrvChild from bs->children.
8
Message-id: 20200924151549.913737-6-stefanha@redhat.com
7
- Calls bdrv_unref() to unref the child BlockDriverState.
9
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10
---
11
block/export/vhost-user-blk-server.c | 68 +++++++++-------------------
12
1 file changed, 21 insertions(+), 47 deletions(-)
8
13
9
When bdrv_unref_child() was introduced in commit 33a604075c it was not
14
diff --git a/block/export/vhost-user-blk-server.c b/block/export/vhost-user-blk-server.c
10
used in bdrv_close() because the drivers that had additional children
11
(like quorum or blkverify) had already called bdrv_unref() on their
12
children during their own close functions.
13
14
This was changed later (in 0bd6e91a7e for quorum, in 3e586be0b2 for
15
blkverify) so there's no reason not to use bdrv_unref_child() in
16
bdrv_close() anymore.
17
18
After this there's also no need to remove bs->backing and bs->file
19
separately from the rest of the children, so bdrv_close() can be
20
simplified.
21
22
Now bdrv_close() unrefs all children (before this patch it was only
23
bs->file and bs->backing). As a result, none of the callers of
24
brvd_attach_child() should remove their reference to child_bs (because
25
this function effectively steals that reference). This patch updates a
26
couple of tests that were doing their own bdrv_unref().
27
28
Signed-off-by: Alberto Garcia <berto@igalia.com>
29
Message-id: 6d1d5feaa53aa1ab127adb73d605dc4503e3abd5.1557754872.git.berto@igalia.com
30
[mreitz: s/where/were/]
31
Signed-off-by: Max Reitz <mreitz@redhat.com>
32
---
33
block.c | 16 +++-------------
34
tests/test-bdrv-drain.c | 6 ------
35
tests/test-bdrv-graph-mod.c | 1 -
36
3 files changed, 3 insertions(+), 20 deletions(-)
37
38
diff --git a/block.c b/block.c
39
index XXXXXXX..XXXXXXX 100644
15
index XXXXXXX..XXXXXXX 100644
40
--- a/block.c
16
--- a/block/export/vhost-user-blk-server.c
41
+++ b/block.c
17
+++ b/block/export/vhost-user-blk-server.c
42
@@ -XXX,XX +XXX,XX @@ static void bdrv_close(BlockDriverState *bs)
18
@@ -XXX,XX +XXX,XX @@ struct virtio_blk_inhdr {
43
bs->drv = NULL;
19
};
44
}
20
45
21
typedef struct VuBlockReq {
46
- bdrv_set_backing_hd(bs, NULL, &error_abort);
22
- VuVirtqElement *elem;
47
-
23
+ VuVirtqElement elem;
48
- if (bs->file != NULL) {
24
int64_t sector_num;
49
- bdrv_unref_child(bs, bs->file);
25
size_t size;
50
- bs->file = NULL;
26
struct virtio_blk_inhdr *in;
27
@@ -XXX,XX +XXX,XX @@ static void vu_block_req_complete(VuBlockReq *req)
28
VuDev *vu_dev = &req->server->vu_dev;
29
30
/* IO size with 1 extra status byte */
31
- vu_queue_push(vu_dev, req->vq, req->elem, req->size + 1);
32
+ vu_queue_push(vu_dev, req->vq, &req->elem, req->size + 1);
33
vu_queue_notify(vu_dev, req->vq);
34
35
- if (req->elem) {
36
- free(req->elem);
51
- }
37
- }
52
-
38
-
53
QLIST_FOREACH_SAFE(child, &bs->children, next, next) {
39
- g_free(req);
54
- /* TODO Remove bdrv_unref() from drivers' close function and use
40
+ free(req);
55
- * bdrv_unref_child() here */
41
}
56
- if (child->bs->inherits_from == bs) {
42
57
- child->bs->inherits_from = NULL;
43
static VuBlockDev *get_vu_block_device_by_server(VuServer *server)
58
- }
44
@@ -XXX,XX +XXX,XX @@ static void coroutine_fn vu_block_flush(VuBlockReq *req)
59
- bdrv_detach_child(child);
45
blk_co_flush(backend);
60
+ bdrv_unref_child(bs, child);
46
}
47
48
-struct req_data {
49
- VuServer *server;
50
- VuVirtq *vq;
51
- VuVirtqElement *elem;
52
-};
53
-
54
static void coroutine_fn vu_block_virtio_process_req(void *opaque)
55
{
56
- struct req_data *data = opaque;
57
- VuServer *server = data->server;
58
- VuVirtq *vq = data->vq;
59
- VuVirtqElement *elem = data->elem;
60
+ VuBlockReq *req = opaque;
61
+ VuServer *server = req->server;
62
+ VuVirtqElement *elem = &req->elem;
63
uint32_t type;
64
- VuBlockReq *req;
65
66
VuBlockDev *vdev_blk = get_vu_block_device_by_server(server);
67
BlockBackend *backend = vdev_blk->backend;
68
@@ -XXX,XX +XXX,XX @@ static void coroutine_fn vu_block_virtio_process_req(void *opaque)
69
struct iovec *out_iov = elem->out_sg;
70
unsigned in_num = elem->in_num;
71
unsigned out_num = elem->out_num;
72
+
73
/* refer to hw/block/virtio_blk.c */
74
if (elem->out_num < 1 || elem->in_num < 1) {
75
error_report("virtio-blk request missing headers");
76
- free(elem);
77
- return;
78
+ goto err;
61
}
79
}
62
80
63
+ bs->backing = NULL;
81
- req = g_new0(VuBlockReq, 1);
64
+ bs->file = NULL;
82
- req->server = server;
65
g_free(bs->opaque);
83
- req->vq = vq;
66
bs->opaque = NULL;
84
- req->elem = elem;
67
atomic_set(&bs->copy_on_read, 0);
68
diff --git a/tests/test-bdrv-drain.c b/tests/test-bdrv-drain.c
69
index XXXXXXX..XXXXXXX 100644
70
--- a/tests/test-bdrv-drain.c
71
+++ b/tests/test-bdrv-drain.c
72
@@ -XXX,XX +XXX,XX @@ static void test_detach_indirect(bool by_parent_cb)
73
bdrv_unref(parent_b);
74
blk_unref(blk);
75
76
- /* XXX Once bdrv_close() unref's children instead of just detaching them,
77
- * this won't be necessary any more. */
78
- bdrv_unref(a);
79
- bdrv_unref(a);
80
- bdrv_unref(c);
81
-
85
-
82
g_assert_cmpint(a->refcnt, ==, 1);
86
if (unlikely(iov_to_buf(out_iov, out_num, 0, &req->out,
83
g_assert_cmpint(b->refcnt, ==, 1);
87
sizeof(req->out)) != sizeof(req->out))) {
84
g_assert_cmpint(c->refcnt, ==, 1);
88
error_report("virtio-blk request outhdr too short");
85
diff --git a/tests/test-bdrv-graph-mod.c b/tests/test-bdrv-graph-mod.c
89
@@ -XXX,XX +XXX,XX @@ static void coroutine_fn vu_block_virtio_process_req(void *opaque)
86
index XXXXXXX..XXXXXXX 100644
90
87
--- a/tests/test-bdrv-graph-mod.c
91
err:
88
+++ b/tests/test-bdrv-graph-mod.c
92
free(elem);
89
@@ -XXX,XX +XXX,XX @@ static void test_update_perm_tree(void)
93
- g_free(req);
90
g_assert_nonnull(local_err);
94
- return;
91
error_free(local_err);
92
93
- bdrv_unref(bs);
94
blk_unref(root);
95
}
95
}
96
96
97
static void vu_block_process_vq(VuDev *vu_dev, int idx)
98
{
99
- VuServer *server;
100
- VuVirtq *vq;
101
- struct req_data *req_data;
102
+ VuServer *server = container_of(vu_dev, VuServer, vu_dev);
103
+ VuVirtq *vq = vu_get_queue(vu_dev, idx);
104
105
- server = container_of(vu_dev, VuServer, vu_dev);
106
- assert(server);
107
-
108
- vq = vu_get_queue(vu_dev, idx);
109
- assert(vq);
110
- VuVirtqElement *elem;
111
while (1) {
112
- elem = vu_queue_pop(vu_dev, vq, sizeof(VuVirtqElement) +
113
- sizeof(VuBlockReq));
114
- if (elem) {
115
- req_data = g_new0(struct req_data, 1);
116
- req_data->server = server;
117
- req_data->vq = vq;
118
- req_data->elem = elem;
119
- Coroutine *co = qemu_coroutine_create(vu_block_virtio_process_req,
120
- req_data);
121
- aio_co_enter(server->ioc->ctx, co);
122
- } else {
123
+ VuBlockReq *req;
124
+
125
+ req = vu_queue_pop(vu_dev, vq, sizeof(VuBlockReq));
126
+ if (!req) {
127
break;
128
}
129
+
130
+ req->server = server;
131
+ req->vq = vq;
132
+
133
+ Coroutine *co =
134
+ qemu_coroutine_create(vu_block_virtio_process_req, req);
135
+ qemu_coroutine_enter(co);
136
}
137
}
138
--
2.21.0

--
2.26.2

The device panic notifier callback is not used. Drop it.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20200924151549.913737-7-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
util/vhost-user-server.h | 3 ---
block/export/vhost-user-blk-server.c | 3 +--
util/vhost-user-server.c | 6 ------
3 files changed, 1 insertion(+), 11 deletions(-)

diff --git a/util/vhost-user-server.h b/util/vhost-user-server.h
index XXXXXXX..XXXXXXX 100644
--- a/util/vhost-user-server.h
+++ b/util/vhost-user-server.h
@@ -XXX,XX +XXX,XX @@ typedef struct VuFdWatch {
} VuFdWatch;

typedef struct VuServer VuServer;
-typedef void DevicePanicNotifierFn(VuServer *server);

struct VuServer {
QIONetListener *listener;
AioContext *ctx;
- DevicePanicNotifierFn *device_panic_notifier;
int max_queues;
const VuDevIface *vu_iface;
VuDev vu_dev;
@@ -XXX,XX +XXX,XX @@ bool vhost_user_server_start(VuServer *server,
SocketAddress *unix_socket,
AioContext *ctx,
uint16_t max_queues,
- DevicePanicNotifierFn *device_panic_notifier,
const VuDevIface *vu_iface,
Error **errp);

diff --git a/block/export/vhost-user-blk-server.c b/block/export/vhost-user-blk-server.c
index XXXXXXX..XXXXXXX 100644
--- a/block/export/vhost-user-blk-server.c
+++ b/block/export/vhost-user-blk-server.c
@@ -XXX,XX +XXX,XX @@ static void vhost_user_blk_server_start(VuBlockDev *vu_block_device,
ctx = bdrv_get_aio_context(blk_bs(vu_block_device->backend));

if (!vhost_user_server_start(&vu_block_device->vu_server, addr, ctx,
- VHOST_USER_BLK_MAX_QUEUES,
- NULL, &vu_block_iface,
+ VHOST_USER_BLK_MAX_QUEUES, &vu_block_iface,
errp)) {
goto error;
}
diff --git a/util/vhost-user-server.c b/util/vhost-user-server.c
index XXXXXXX..XXXXXXX 100644
--- a/util/vhost-user-server.c
+++ b/util/vhost-user-server.c
@@ -XXX,XX +XXX,XX @@ static void panic_cb(VuDev *vu_dev, const char *buf)
close_client(server);
}

- if (server->device_panic_notifier) {
- server->device_panic_notifier(server);
- }
-
/*
* Set the callback function for network listener so another
* vhost-user client can connect to this server
@@ -XXX,XX +XXX,XX @@ bool vhost_user_server_start(VuServer *server,
SocketAddress *socket_addr,
AioContext *ctx,
uint16_t max_queues,
- DevicePanicNotifierFn *device_panic_notifier,
const VuDevIface *vu_iface,
Error **errp)
{
@@ -XXX,XX +XXX,XX @@ bool vhost_user_server_start(VuServer *server,
.vu_iface = vu_iface,
.max_queues = max_queues,
.ctx = ctx,
- .device_panic_notifier = device_panic_notifier,
};

qio_net_listener_set_name(server->listener, "vhost-user-backend-listener");
--
2.26.2

fds[] is leaked when qio_channel_readv_full() fails.

Use vmsg->fds[] instead of keeping a local fds[] array. Then we can
reuse goto fail to clean up fds. vmsg->fd_num must be zeroed before the
loop to make this safe.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20200924151549.913737-8-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
util/vhost-user-server.c | 50 ++++++++++++++++++----------------------
1 file changed, 23 insertions(+), 27 deletions(-)

diff --git a/util/vhost-user-server.c b/util/vhost-user-server.c

From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

Background: decryption will be done in threads; to benefit from that,
we should move it out of the lock first.

But let's go further: it turns out that only
qcow2_get_cluster_offset() needs locking, so reduce the locking to it.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-id: 20190506142741.41731-7-vsementsov@virtuozzo.com
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
---
block/qcow2.c | 12 ++----------
1 file changed, 2 insertions(+), 10 deletions(-)
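To make the new lock scope concrete: after this change qcow2_co_preadv() only
holds s->lock around the cluster lookup, while the backing-file reads,
compressed-cluster reads, data reads and decryption run unlocked. A condensed
sketch of the resulting loop, based on the hunks below (error paths and
surrounding code omitted):

    while (bytes != 0) {
        /* the metadata lookup still needs the qcow2 lock */
        qemu_co_mutex_lock(&s->lock);
        ret = qcow2_get_cluster_offset(bs, offset, &cur_bytes, &cluster_offset);
        qemu_co_mutex_unlock(&s->lock);
        if (ret < 0) {
            goto fail;
        }

        /* all reads and decryption below proceed without s->lock */
    }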
diff --git a/block/qcow2.c b/block/qcow2.c
18
index XXXXXXX..XXXXXXX 100644
15
index XXXXXXX..XXXXXXX 100644
19
--- a/block/qcow2.c
16
--- a/util/vhost-user-server.c
20
+++ b/block/qcow2.c
17
+++ b/util/vhost-user-server.c
21
@@ -XXX,XX +XXX,XX @@ static coroutine_fn int qcow2_co_preadv(BlockDriverState *bs, uint64_t offset,
18
@@ -XXX,XX +XXX,XX @@ vu_message_read(VuDev *vu_dev, int conn_fd, VhostUserMsg *vmsg)
22
19
};
23
qemu_iovec_init(&hd_qiov, qiov->niov);
20
int rc, read_bytes = 0;
24
21
Error *local_err = NULL;
25
- qemu_co_mutex_lock(&s->lock);
22
- /*
26
-
23
- * Store fds/nfds returned from qio_channel_readv_full into
27
while (bytes != 0) {
24
- * temporary variables.
28
25
- *
29
/* prepare next request */
26
- * VhostUserMsg is a packed structure, gcc will complain about passing
30
@@ -XXX,XX +XXX,XX @@ static coroutine_fn int qcow2_co_preadv(BlockDriverState *bs, uint64_t offset,
27
- * pointer to a packed structure member if we pass &VhostUserMsg.fd_num
31
QCOW_MAX_CRYPT_CLUSTERS * s->cluster_size);
28
- * and &VhostUserMsg.fds directly when calling qio_channel_readv_full,
29
- * thus two temporary variables nfds and fds are used here.
30
- */
31
- size_t nfds = 0, nfds_t = 0;
32
const size_t max_fds = G_N_ELEMENTS(vmsg->fds);
33
- int *fds_t = NULL;
34
VuServer *server = container_of(vu_dev, VuServer, vu_dev);
35
QIOChannel *ioc = server->ioc;
36
37
+ vmsg->fd_num = 0;
38
if (!ioc) {
39
error_report_err(local_err);
40
goto fail;
41
@@ -XXX,XX +XXX,XX @@ vu_message_read(VuDev *vu_dev, int conn_fd, VhostUserMsg *vmsg)
42
43
assert(qemu_in_coroutine());
44
do {
45
+ size_t nfds = 0;
46
+ int *fds = NULL;
47
+
48
/*
49
* qio_channel_readv_full may have short reads, keeping calling it
50
* until getting VHOST_USER_HDR_SIZE or 0 bytes in total
51
*/
52
- rc = qio_channel_readv_full(ioc, &iov, 1, &fds_t, &nfds_t, &local_err);
53
+ rc = qio_channel_readv_full(ioc, &iov, 1, &fds, &nfds, &local_err);
54
if (rc < 0) {
55
if (rc == QIO_CHANNEL_ERR_BLOCK) {
56
+ assert(local_err == NULL);
57
qio_channel_yield(ioc, G_IO_IN);
58
continue;
59
} else {
60
error_report_err(local_err);
61
- return false;
62
+ goto fail;
63
}
32
}
64
}
33
65
- read_bytes += rc;
34
+ qemu_co_mutex_lock(&s->lock);
66
- if (nfds_t > 0) {
35
ret = qcow2_get_cluster_offset(bs, offset, &cur_bytes, &cluster_offset);
67
- if (nfds + nfds_t > max_fds) {
36
+ qemu_co_mutex_unlock(&s->lock);
68
+
37
if (ret < 0) {
69
+ if (nfds > 0) {
38
goto fail;
70
+ if (vmsg->fd_num + nfds > max_fds) {
39
}
71
error_report("A maximum of %zu fds are allowed, "
40
@@ -XXX,XX +XXX,XX @@ static coroutine_fn int qcow2_co_preadv(BlockDriverState *bs, uint64_t offset,
72
"however got %zu fds now",
41
73
- max_fds, nfds + nfds_t);
42
if (bs->backing) {
74
+ max_fds, vmsg->fd_num + nfds);
43
BLKDBG_EVENT(bs->file, BLKDBG_READ_BACKING_AIO);
75
+ g_free(fds);
44
- qemu_co_mutex_unlock(&s->lock);
45
ret = bdrv_co_preadv(bs->backing, offset, cur_bytes,
46
&hd_qiov, 0);
47
- qemu_co_mutex_lock(&s->lock);
48
if (ret < 0) {
49
goto fail;
50
}
51
@@ -XXX,XX +XXX,XX @@ static coroutine_fn int qcow2_co_preadv(BlockDriverState *bs, uint64_t offset,
52
break;
53
54
case QCOW2_CLUSTER_COMPRESSED:
55
- qemu_co_mutex_unlock(&s->lock);
56
ret = qcow2_co_preadv_compressed(bs, cluster_offset,
57
offset, cur_bytes,
58
&hd_qiov);
59
- qemu_co_mutex_lock(&s->lock);
60
if (ret < 0) {
61
goto fail;
76
goto fail;
62
}
77
}
63
@@ -XXX,XX +XXX,XX @@ static coroutine_fn int qcow2_co_preadv(BlockDriverState *bs, uint64_t offset,
78
- memcpy(vmsg->fds + nfds, fds_t,
64
}
79
- nfds_t *sizeof(vmsg->fds[0]));
65
80
- nfds += nfds_t;
66
BLKDBG_EVENT(bs->file, BLKDBG_READ_AIO);
81
- g_free(fds_t);
67
- qemu_co_mutex_unlock(&s->lock);
82
+ memcpy(vmsg->fds + vmsg->fd_num, fds, nfds * sizeof(vmsg->fds[0]));
68
ret = bdrv_co_preadv(s->data_file,
83
+ vmsg->fd_num += nfds;
69
cluster_offset + offset_in_cluster,
84
+ g_free(fds);
70
cur_bytes, &hd_qiov, 0);
85
}
71
- qemu_co_mutex_lock(&s->lock);
86
- if (read_bytes == VHOST_USER_HDR_SIZE || rc == 0) {
72
if (ret < 0) {
87
- break;
73
goto fail;
88
+
74
}
89
+ if (rc == 0) { /* socket closed */
75
@@ -XXX,XX +XXX,XX @@ static coroutine_fn int qcow2_co_preadv(BlockDriverState *bs, uint64_t offset,
90
+ goto fail;
76
ret = 0;
91
}
77
92
- iov.iov_base = (char *)vmsg + read_bytes;
78
fail:
93
- iov.iov_len = VHOST_USER_HDR_SIZE - read_bytes;
79
- qemu_co_mutex_unlock(&s->lock);
94
- } while (true);
80
-
95
81
qemu_iovec_destroy(&hd_qiov);
96
- vmsg->fd_num = nfds;
82
qemu_vfree(cluster_data);
97
+ iov.iov_base += rc;
83
98
+ iov.iov_len -= rc;
99
+ read_bytes += rc;
100
+ } while (read_bytes != VHOST_USER_HDR_SIZE);
101
+
102
/* qio_channel_readv_full will make socket fds blocking, unblock them */
103
vmsg_unblock_fds(vmsg);
104
if (vmsg->size > sizeof(vmsg->payload)) {
84
--
105
--
85
2.21.0
106
2.26.2
86
107
87

Unexpected EOF is an error that must be reported.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20200924151549.913737-9-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
util/vhost-user-server.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/util/vhost-user-server.c b/util/vhost-user-server.c

From: Sam Eiderman <shmuel.eiderman@oracle.com>

If a chain was detected, don't open a new BlockBackend from the target
backing file, which would create a new BlockDriverState. Instead, create
an empty BlockBackend and attach the already open BlockDriverState.

Permissions for blk_new() were copied from blk_new_open() when
flags = 0.

Reviewed-by: Karl Heubaum <karl.heubaum@oracle.com>
Reviewed-by: Eyal Moscovici <eyal.moscovici@oracle.com>
Signed-off-by: Sagi Amit <sagi.amit@oracle.com>
Co-developed-by: Sagi Amit <sagi.amit@oracle.com>
Signed-off-by: Sam Eiderman <shmuel.eiderman@oracle.com>
Message-id: 20190523163337.4497-4-shmuel.eiderman@oracle.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
---
qemu-img.c | 33 +++++++++++++++++++++++----------
1 file changed, 23 insertions(+), 10 deletions(-)
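The core of the qemu-img change, condensed from the img_rebase() hunk below
(error reporting and cleanup trimmed): when the new backing file is already
part of the image's chain, wrap the existing node instead of opening it again.

    if (prefix_chain_bs) {
        /* the node is already open somewhere in the chain: reuse it */
        blk_new_backing = blk_new(BLK_PERM_CONSISTENT_READ, BLK_PERM_ALL);
        ret = blk_insert_bs(blk_new_backing, prefix_chain_bs, &local_err);
    } else {
        /* otherwise open the new backing file as before */
        blk_new_backing = blk_new_open(out_real_path, NULL, options,
                                       src_flags, &local_err);
    }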
20
21
diff --git a/qemu-img.c b/qemu-img.c
22
index XXXXXXX..XXXXXXX 100644
11
index XXXXXXX..XXXXXXX 100644
23
--- a/qemu-img.c
12
--- a/util/vhost-user-server.c
24
+++ b/qemu-img.c
13
+++ b/util/vhost-user-server.c
25
@@ -XXX,XX +XXX,XX @@ static int img_rebase(int argc, char **argv)
14
@@ -XXX,XX +XXX,XX @@ vu_message_read(VuDev *vu_dev, int conn_fd, VhostUserMsg *vmsg)
26
* in its chain.
15
};
27
*/
16
if (vmsg->size) {
28
prefix_chain_bs = bdrv_find_backing_image(bs, out_real_path);
17
rc = qio_channel_readv_all_eof(ioc, &iov_payload, 1, &local_err);
29
-
18
- if (rc == -1) {
30
- blk_new_backing = blk_new_open(out_real_path, NULL,
19
- error_report_err(local_err);
31
- options, src_flags, &local_err);
20
+ if (rc != 1) {
32
- g_free(out_real_path);
21
+ if (local_err) {
33
- if (!blk_new_backing) {
22
+ error_report_err(local_err);
34
- error_reportf_err(local_err,
23
+ }
35
- "Could not open new backing file '%s': ",
24
goto fail;
36
- out_baseimg);
37
- ret = -1;
38
- goto out;
39
+ if (prefix_chain_bs) {
40
+ g_free(out_real_path);
41
+ blk_new_backing = blk_new(BLK_PERM_CONSISTENT_READ,
42
+ BLK_PERM_ALL);
43
+ ret = blk_insert_bs(blk_new_backing, prefix_chain_bs,
44
+ &local_err);
45
+ if (ret < 0) {
46
+ error_reportf_err(local_err,
47
+ "Could not reuse backing file '%s': ",
48
+ out_baseimg);
49
+ goto out;
50
+ }
51
+ } else {
52
+ blk_new_backing = blk_new_open(out_real_path, NULL,
53
+ options, src_flags, &local_err);
54
+ g_free(out_real_path);
55
+ if (!blk_new_backing) {
56
+ error_reportf_err(local_err,
57
+ "Could not open new backing file '%s': ",
58
+ out_baseimg);
59
+ ret = -1;
60
+ goto out;
61
+ }
62
}
63
}
25
}
64
}
26
}
65
--
27
--
66
2.21.0
28
2.26.2
67
29
68

The vu_client_trip() coroutine is leaked during AioContext switching. It
is also unsafe to destroy the vu_dev in panic_cb() since its callers
still access it in some cases.

Rework the lifecycle to solve these safety issues.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20200924151549.913737-10-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
util/vhost-user-server.h | 29 ++--
block/export/vhost-user-blk-server.c | 9 +-
util/vhost-user-server.c | 245 +++++++++++++++------------
3 files changed, 155 insertions(+), 128 deletions(-)

From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

Do encryption/decryption in threads, like it is already done for
compression. This improves asynchronous encrypted I/O.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20190506142741.41731-9-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
---
block/qcow2.h | 8 ++++++
block/qcow2-cluster.c | 7 ++---
block/qcow2-threads.c | 65 +++++++++++++++++++++++++++++++++++++++++--
block/qcow2.c | 22 +++++----------
4 files changed, 81 insertions(+), 21 deletions(-)
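The mechanism the qcow2 patch uses, condensed from the block/qcow2-threads.c
hunk below: the en/decryption request is bundled into a small struct and run
through the same thread-pool helper already used for compression, so the
coroutine yields instead of doing crypto inline.

    typedef struct Qcow2EncDecData {
        QCryptoBlock *block;
        uint64_t offset;
        uint8_t *buf;
        size_t len;
        Qcow2EncDecFunc func;   /* qcrypto_block_encrypt() or _decrypt() */
    } Qcow2EncDecData;

    static int qcow2_encdec_pool_func(void *opaque)
    {
        Qcow2EncDecData *data = opaque;

        return data->func(data->block, data->offset, data->buf, data->len, NULL);
    }

qcow2_co_encrypt()/qcow2_co_decrypt() fill in a Qcow2EncDecData and pass it to
qcow2_co_process(), which submits it to the block layer thread pool.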
17
15
18
diff --git a/block/qcow2.h b/block/qcow2.h
16
diff --git a/util/vhost-user-server.h b/util/vhost-user-server.h
19
index XXXXXXX..XXXXXXX 100644
17
index XXXXXXX..XXXXXXX 100644
20
--- a/block/qcow2.h
18
--- a/util/vhost-user-server.h
21
+++ b/block/qcow2.h
19
+++ b/util/vhost-user-server.h
22
@@ -XXX,XX +XXX,XX @@ typedef struct Qcow2BitmapHeaderExt {
20
@@ -XXX,XX +XXX,XX @@
23
uint64_t bitmap_directory_offset;
21
#include "qapi/error.h"
24
} QEMU_PACKED Qcow2BitmapHeaderExt;
22
#include "standard-headers/linux/virtio_blk.h"
25
23
26
+#define QCOW2_MAX_THREADS 4
24
+/* A kick fd that we monitor on behalf of libvhost-user */
27
+
25
typedef struct VuFdWatch {
28
typedef struct BDRVQcow2State {
26
VuDev *vu_dev;
29
int cluster_bits;
27
int fd; /*kick fd*/
30
int cluster_size;
28
void *pvt;
31
@@ -XXX,XX +XXX,XX @@ qcow2_co_compress(BlockDriverState *bs, void *dest, size_t dest_size,
29
vu_watch_cb cb;
32
ssize_t coroutine_fn
30
- bool processing;
33
qcow2_co_decompress(BlockDriverState *bs, void *dest, size_t dest_size,
31
QTAILQ_ENTRY(VuFdWatch) next;
34
const void *src, size_t src_size);
32
} VuFdWatch;
35
+int coroutine_fn
33
36
+qcow2_co_encrypt(BlockDriverState *bs, uint64_t file_cluster_offset,
34
-typedef struct VuServer VuServer;
37
+ uint64_t offset, void *buf, size_t len);
35
-
38
+int coroutine_fn
36
-struct VuServer {
39
+qcow2_co_decrypt(BlockDriverState *bs, uint64_t file_cluster_offset,
37
+/**
40
+ uint64_t offset, void *buf, size_t len);
38
+ * VuServer:
41
39
+ * A vhost-user server instance with user-defined VuDevIface callbacks.
42
#endif
40
+ * Vhost-user device backends can be implemented using VuServer. VuDevIface
43
diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
41
+ * callbacks and virtqueue kicks run in the given AioContext.
42
+ */
43
+typedef struct {
44
QIONetListener *listener;
45
+ QEMUBH *restart_listener_bh;
46
AioContext *ctx;
47
int max_queues;
48
const VuDevIface *vu_iface;
49
+
50
+ /* Protected by ctx lock */
51
VuDev vu_dev;
52
QIOChannel *ioc; /* The I/O channel with the client */
53
QIOChannelSocket *sioc; /* The underlying data channel with the client */
54
- /* IOChannel for fd provided via VHOST_USER_SET_SLAVE_REQ_FD */
55
- QIOChannel *ioc_slave;
56
- QIOChannelSocket *sioc_slave;
57
- Coroutine *co_trip; /* coroutine for processing VhostUserMsg */
58
QTAILQ_HEAD(, VuFdWatch) vu_fd_watches;
59
- /* restart coroutine co_trip if AIOContext is changed */
60
- bool aio_context_changed;
61
- bool processing_msg;
62
-};
63
+
64
+ Coroutine *co_trip; /* coroutine for processing VhostUserMsg */
65
+} VuServer;
66
67
bool vhost_user_server_start(VuServer *server,
68
SocketAddress *unix_socket,
69
@@ -XXX,XX +XXX,XX @@ bool vhost_user_server_start(VuServer *server,
70
71
void vhost_user_server_stop(VuServer *server);
72
73
-void vhost_user_server_set_aio_context(VuServer *server, AioContext *ctx);
74
+void vhost_user_server_attach_aio_context(VuServer *server, AioContext *ctx);
75
+void vhost_user_server_detach_aio_context(VuServer *server);
76
77
#endif /* VHOST_USER_SERVER_H */
78
diff --git a/block/export/vhost-user-blk-server.c b/block/export/vhost-user-blk-server.c
44
index XXXXXXX..XXXXXXX 100644
79
index XXXXXXX..XXXXXXX 100644
45
--- a/block/qcow2-cluster.c
80
--- a/block/export/vhost-user-blk-server.c
46
+++ b/block/qcow2-cluster.c
81
+++ b/block/export/vhost-user-blk-server.c
47
@@ -XXX,XX +XXX,XX @@ static bool coroutine_fn do_perform_cow_encrypt(BlockDriverState *bs,
82
@@ -XXX,XX +XXX,XX @@ static const VuDevIface vu_block_iface = {
48
{
83
static void blk_aio_attached(AioContext *ctx, void *opaque)
49
if (bytes && bs->encrypted) {
84
{
50
BDRVQcow2State *s = bs->opaque;
85
VuBlockDev *vub_dev = opaque;
51
- int64_t offset = (s->crypt_physical_offset ?
86
- aio_context_acquire(ctx);
52
- (cluster_offset + offset_in_cluster) :
87
- vhost_user_server_set_aio_context(&vub_dev->vu_server, ctx);
53
- (src_cluster_offset + offset_in_cluster));
88
- aio_context_release(ctx);
54
assert((offset_in_cluster & ~BDRV_SECTOR_MASK) == 0);
89
+ vhost_user_server_attach_aio_context(&vub_dev->vu_server, ctx);
55
assert((bytes & ~BDRV_SECTOR_MASK) == 0);
90
}
56
assert(s->crypto);
91
57
- if (qcrypto_block_encrypt(s->crypto, offset, buffer, bytes, NULL) < 0) {
92
static void blk_aio_detach(void *opaque)
58
+ if (qcow2_co_encrypt(bs, cluster_offset,
93
{
59
+ src_cluster_offset + offset_in_cluster,
94
VuBlockDev *vub_dev = opaque;
60
+ buffer, bytes) < 0) {
95
- AioContext *ctx = vub_dev->vu_server.ctx;
61
return false;
96
- aio_context_acquire(ctx);
62
}
97
- vhost_user_server_set_aio_context(&vub_dev->vu_server, NULL);
63
}
98
- aio_context_release(ctx);
64
diff --git a/block/qcow2-threads.c b/block/qcow2-threads.c
99
+ vhost_user_server_detach_aio_context(&vub_dev->vu_server);
100
}
101
102
static void
103
diff --git a/util/vhost-user-server.c b/util/vhost-user-server.c
65
index XXXXXXX..XXXXXXX 100644
104
index XXXXXXX..XXXXXXX 100644
66
--- a/block/qcow2-threads.c
105
--- a/util/vhost-user-server.c
67
+++ b/block/qcow2-threads.c
106
+++ b/util/vhost-user-server.c
68
@@ -XXX,XX +XXX,XX @@
107
@@ -XXX,XX +XXX,XX @@
69
108
*/
70
#include "qcow2.h"
109
#include "qemu/osdep.h"
71
#include "block/thread-pool.h"
110
#include "qemu/main-loop.h"
72
-
111
+#include "block/aio-wait.h"
73
-#define QCOW2_MAX_THREADS 4
112
#include "vhost-user-server.h"
74
+#include "crypto.h"
113
75
76
static int coroutine_fn
77
qcow2_co_process(BlockDriverState *bs, ThreadPoolFunc *func, void *arg)
78
@@ -XXX,XX +XXX,XX @@ qcow2_co_decompress(BlockDriverState *bs, void *dest, size_t dest_size,
79
return qcow2_co_do_compress(bs, dest, dest_size, src, src_size,
80
qcow2_decompress);
81
}
82
+
83
+
84
+/*
114
+/*
85
+ * Cryptography
115
+ * Theory of operation:
116
+ *
117
+ * VuServer is started and stopped by vhost_user_server_start() and
118
+ * vhost_user_server_stop() from the main loop thread. Starting the server
119
+ * opens a vhost-user UNIX domain socket and listens for incoming connections.
120
+ * Only one connection is allowed at a time.
121
+ *
122
+ * The connection is handled by the vu_client_trip() coroutine in the
123
+ * VuServer->ctx AioContext. The coroutine consists of a vu_dispatch() loop
124
+ * where libvhost-user calls vu_message_read() to receive the next vhost-user
125
+ * protocol messages over the UNIX domain socket.
126
+ *
127
+ * When virtqueues are set up libvhost-user calls set_watch() to monitor kick
128
+ * fds. These fds are also handled in the VuServer->ctx AioContext.
129
+ *
130
+ * Both vu_client_trip() and kick fd monitoring can be stopped by shutting down
131
+ * the socket connection. Shutting down the socket connection causes
132
+ * vu_message_read() to fail since no more data can be received from the socket.
133
+ * After vu_dispatch() fails, vu_client_trip() calls vu_deinit() to stop
134
+ * libvhost-user before terminating the coroutine. vu_deinit() calls
135
+ * remove_watch() to stop monitoring kick fds and this stops virtqueue
136
+ * processing.
137
+ *
138
+ * When vu_client_trip() has finished cleaning up it schedules a BH in the main
139
+ * loop thread to accept the next client connection.
140
+ *
141
+ * When libvhost-user detects an error it calls panic_cb() and sets the
142
+ * dev->broken flag. Both vu_client_trip() and kick fd processing stop when
143
+ * the dev->broken flag is set.
144
+ *
145
+ * It is possible to switch AioContexts using
146
+ * vhost_user_server_detach_aio_context() and
147
+ * vhost_user_server_attach_aio_context(). They stop monitoring fds in the old
148
+ * AioContext and resume monitoring in the new AioContext. The vu_client_trip()
149
+ * coroutine remains in a yielded state during the switch. This is made
150
+ * possible by QIOChannel's support for spurious coroutine re-entry in
151
+ * qio_channel_yield(). The coroutine will restart I/O when re-entered from the
152
+ * new AioContext.
86
+ */
153
+ */
87
+
154
+
155
static void vmsg_close_fds(VhostUserMsg *vmsg)
156
{
157
int i;
158
@@ -XXX,XX +XXX,XX @@ static void vmsg_unblock_fds(VhostUserMsg *vmsg)
159
}
160
}
161
162
-static void vu_accept(QIONetListener *listener, QIOChannelSocket *sioc,
163
- gpointer opaque);
164
-
165
-static void close_client(VuServer *server)
166
-{
167
- /*
168
- * Before closing the client
169
- *
170
- * 1. Let vu_client_trip stop processing new vhost-user msg
171
- *
172
- * 2. remove kick_handler
173
- *
174
- * 3. wait for the kick handler to be finished
175
- *
176
- * 4. wait for the current vhost-user msg to be finished processing
177
- */
178
-
179
- QIOChannelSocket *sioc = server->sioc;
180
- /* When this is set vu_client_trip will stop new processing vhost-user message */
181
- server->sioc = NULL;
182
-
183
- while (server->processing_msg) {
184
- if (server->ioc->read_coroutine) {
185
- server->ioc->read_coroutine = NULL;
186
- qio_channel_set_aio_fd_handler(server->ioc, server->ioc->ctx, NULL,
187
- NULL, server->ioc);
188
- server->processing_msg = false;
189
- }
190
- }
191
-
192
- vu_deinit(&server->vu_dev);
193
-
194
- /* vu_deinit() should have called remove_watch() */
195
- assert(QTAILQ_EMPTY(&server->vu_fd_watches));
196
-
197
- object_unref(OBJECT(sioc));
198
- object_unref(OBJECT(server->ioc));
199
-}
200
-
201
static void panic_cb(VuDev *vu_dev, const char *buf)
202
{
203
- VuServer *server = container_of(vu_dev, VuServer, vu_dev);
204
-
205
- /* avoid while loop in close_client */
206
- server->processing_msg = false;
207
-
208
- if (buf) {
209
- error_report("vu_panic: %s", buf);
210
- }
211
-
212
- if (server->sioc) {
213
- close_client(server);
214
- }
215
-
216
- /*
217
- * Set the callback function for network listener so another
218
- * vhost-user client can connect to this server
219
- */
220
- qio_net_listener_set_client_func(server->listener,
221
- vu_accept,
222
- server,
223
- NULL);
224
+ error_report("vu_panic: %s", buf);
225
}
226
227
static bool coroutine_fn
228
@@ -XXX,XX +XXX,XX @@ fail:
229
return false;
230
}
231
232
-
233
-static void vu_client_start(VuServer *server);
234
static coroutine_fn void vu_client_trip(void *opaque)
235
{
236
VuServer *server = opaque;
237
+ VuDev *vu_dev = &server->vu_dev;
238
239
- while (!server->aio_context_changed && server->sioc) {
240
- server->processing_msg = true;
241
- vu_dispatch(&server->vu_dev);
242
- server->processing_msg = false;
243
+ while (!vu_dev->broken && vu_dispatch(vu_dev)) {
244
+ /* Keep running */
245
}
246
247
- if (server->aio_context_changed && server->sioc) {
248
- server->aio_context_changed = false;
249
- vu_client_start(server);
250
- }
251
-}
252
+ vu_deinit(vu_dev);
253
+
254
+ /* vu_deinit() should have called remove_watch() */
255
+ assert(QTAILQ_EMPTY(&server->vu_fd_watches));
256
+
257
+ object_unref(OBJECT(server->sioc));
258
+ server->sioc = NULL;
259
260
-static void vu_client_start(VuServer *server)
261
-{
262
- server->co_trip = qemu_coroutine_create(vu_client_trip, server);
263
- aio_co_enter(server->ctx, server->co_trip);
264
+ object_unref(OBJECT(server->ioc));
265
+ server->ioc = NULL;
266
+
267
+ server->co_trip = NULL;
268
+ if (server->restart_listener_bh) {
269
+ qemu_bh_schedule(server->restart_listener_bh);
270
+ }
271
+ aio_wait_kick();
272
}
273
274
/*
275
@@ -XXX,XX +XXX,XX @@ static void vu_client_start(VuServer *server)
276
static void kick_handler(void *opaque)
277
{
278
VuFdWatch *vu_fd_watch = opaque;
279
- vu_fd_watch->processing = true;
280
- vu_fd_watch->cb(vu_fd_watch->vu_dev, 0, vu_fd_watch->pvt);
281
- vu_fd_watch->processing = false;
282
+ VuDev *vu_dev = vu_fd_watch->vu_dev;
283
+
284
+ vu_fd_watch->cb(vu_dev, 0, vu_fd_watch->pvt);
285
+
286
+ /* Stop vu_client_trip() if an error occurred in vu_fd_watch->cb() */
287
+ if (vu_dev->broken) {
288
+ VuServer *server = container_of(vu_dev, VuServer, vu_dev);
289
+
290
+ qio_channel_shutdown(server->ioc, QIO_CHANNEL_SHUTDOWN_BOTH, NULL);
291
+ }
292
}
293
294
-
295
static VuFdWatch *find_vu_fd_watch(VuServer *server, int fd)
296
{
297
298
@@ -XXX,XX +XXX,XX @@ static void vu_accept(QIONetListener *listener, QIOChannelSocket *sioc,
299
qio_channel_set_name(QIO_CHANNEL(sioc), "vhost-user client");
300
server->ioc = QIO_CHANNEL(sioc);
301
object_ref(OBJECT(server->ioc));
302
- qio_channel_attach_aio_context(server->ioc, server->ctx);
303
+
304
+ /* TODO vu_message_write() spins if non-blocking! */
305
qio_channel_set_blocking(server->ioc, false, NULL);
306
- vu_client_start(server);
307
+
308
+ server->co_trip = qemu_coroutine_create(vu_client_trip, server);
309
+
310
+ aio_context_acquire(server->ctx);
311
+ vhost_user_server_attach_aio_context(server, server->ctx);
312
+ aio_context_release(server->ctx);
313
}
314
315
-
316
void vhost_user_server_stop(VuServer *server)
317
{
318
+ aio_context_acquire(server->ctx);
319
+
320
+ qemu_bh_delete(server->restart_listener_bh);
321
+ server->restart_listener_bh = NULL;
322
+
323
if (server->sioc) {
324
- close_client(server);
325
+ VuFdWatch *vu_fd_watch;
326
+
327
+ QTAILQ_FOREACH(vu_fd_watch, &server->vu_fd_watches, next) {
328
+ aio_set_fd_handler(server->ctx, vu_fd_watch->fd, true,
329
+ NULL, NULL, NULL, vu_fd_watch);
330
+ }
331
+
332
+ qio_channel_shutdown(server->ioc, QIO_CHANNEL_SHUTDOWN_BOTH, NULL);
333
+
334
+ AIO_WAIT_WHILE(server->ctx, server->co_trip);
335
}
336
337
+ aio_context_release(server->ctx);
338
+
339
if (server->listener) {
340
qio_net_listener_disconnect(server->listener);
341
object_unref(OBJECT(server->listener));
342
}
343
+}
344
+
88
+/*
345
+/*
89
+ * Qcow2EncDecFunc: common prototype of qcrypto_block_encrypt() and
346
+ * Allow the next client to connect to the server. Called from a BH in the main
90
+ * qcrypto_block_decrypt() functions.
347
+ * loop.
91
+ */
348
+ */
92
+typedef int (*Qcow2EncDecFunc)(QCryptoBlock *block, uint64_t offset,
349
+static void restart_listener_bh(void *opaque)
93
+ uint8_t *buf, size_t len, Error **errp);
94
+
95
+typedef struct Qcow2EncDecData {
96
+ QCryptoBlock *block;
97
+ uint64_t offset;
98
+ uint8_t *buf;
99
+ size_t len;
100
+
101
+ Qcow2EncDecFunc func;
102
+} Qcow2EncDecData;
103
+
104
+static int qcow2_encdec_pool_func(void *opaque)
105
+{
350
+{
106
+ Qcow2EncDecData *data = opaque;
351
+ VuServer *server = opaque;
107
+
352
108
+ return data->func(data->block, data->offset, data->buf, data->len, NULL);
353
+ qio_net_listener_set_client_func(server->listener, vu_accept, server,
354
+ NULL);
355
}
356
357
-void vhost_user_server_set_aio_context(VuServer *server, AioContext *ctx)
358
+/* Called with ctx acquired */
359
+void vhost_user_server_attach_aio_context(VuServer *server, AioContext *ctx)
360
{
361
- VuFdWatch *vu_fd_watch, *next;
362
- void *opaque = NULL;
363
- IOHandler *io_read = NULL;
364
- bool attach;
365
+ VuFdWatch *vu_fd_watch;
366
367
- server->ctx = ctx ? ctx : qemu_get_aio_context();
368
+ server->ctx = ctx;
369
370
if (!server->sioc) {
371
- /* not yet serving any client*/
372
return;
373
}
374
375
- if (ctx) {
376
- qio_channel_attach_aio_context(server->ioc, ctx);
377
- server->aio_context_changed = true;
378
- io_read = kick_handler;
379
- attach = true;
380
- } else {
381
+ qio_channel_attach_aio_context(server->ioc, ctx);
382
+
383
+ QTAILQ_FOREACH(vu_fd_watch, &server->vu_fd_watches, next) {
384
+ aio_set_fd_handler(ctx, vu_fd_watch->fd, true, kick_handler, NULL,
385
+ NULL, vu_fd_watch);
386
+ }
387
+
388
+ aio_co_schedule(ctx, server->co_trip);
109
+}
389
+}
110
+
390
+
111
+static int coroutine_fn
391
+/* Called with server->ctx acquired */
112
+qcow2_co_encdec(BlockDriverState *bs, uint64_t file_cluster_offset,
392
+void vhost_user_server_detach_aio_context(VuServer *server)
113
+ uint64_t offset, void *buf, size_t len, Qcow2EncDecFunc func)
114
+{
393
+{
115
+ BDRVQcow2State *s = bs->opaque;
394
+ if (server->sioc) {
116
+ Qcow2EncDecData arg = {
395
+ VuFdWatch *vu_fd_watch;
117
+ .block = s->crypto,
396
+
118
+ .offset = s->crypt_physical_offset ?
397
+ QTAILQ_FOREACH(vu_fd_watch, &server->vu_fd_watches, next) {
119
+ file_cluster_offset + offset_into_cluster(s, offset) :
398
+ aio_set_fd_handler(server->ctx, vu_fd_watch->fd, true,
120
+ offset,
399
+ NULL, NULL, NULL, vu_fd_watch);
121
+ .buf = buf,
400
+ }
122
+ .len = len,
401
+
123
+ .func = func,
402
qio_channel_detach_aio_context(server->ioc);
124
+ };
403
- /* server->ioc->ctx keeps the old AioConext */
125
+
404
- ctx = server->ioc->ctx;
126
+ return qcow2_co_process(bs, qcow2_encdec_pool_func, &arg);
405
- attach = false;
127
+}
406
}
128
+
407
129
+int coroutine_fn
408
- QTAILQ_FOREACH_SAFE(vu_fd_watch, &server->vu_fd_watches, next, next) {
130
+qcow2_co_encrypt(BlockDriverState *bs, uint64_t file_cluster_offset,
409
- if (vu_fd_watch->cb) {
131
+ uint64_t offset, void *buf, size_t len)
410
- opaque = attach ? vu_fd_watch : NULL;
132
+{
411
- aio_set_fd_handler(ctx, vu_fd_watch->fd, true,
133
+ return qcow2_co_encdec(bs, file_cluster_offset, offset, buf, len,
412
- io_read, NULL, NULL,
134
+ qcrypto_block_encrypt);
413
- opaque);
135
+}
414
- }
136
+
415
- }
137
+int coroutine_fn
416
+ server->ctx = NULL;
138
+qcow2_co_decrypt(BlockDriverState *bs, uint64_t file_cluster_offset,
417
}
139
+ uint64_t offset, void *buf, size_t len)
418
140
+{
419
-
141
+ return qcow2_co_encdec(bs, file_cluster_offset, offset, buf, len,
420
bool vhost_user_server_start(VuServer *server,
142
+ qcrypto_block_decrypt);
421
SocketAddress *socket_addr,
143
+}
422
AioContext *ctx,
144
diff --git a/block/qcow2.c b/block/qcow2.c
423
@@ -XXX,XX +XXX,XX @@ bool vhost_user_server_start(VuServer *server,
145
index XXXXXXX..XXXXXXX 100644
424
const VuDevIface *vu_iface,
146
--- a/block/qcow2.c
425
Error **errp)
147
+++ b/block/qcow2.c
426
{
148
@@ -XXX,XX +XXX,XX @@ static int qcow2_read_extensions(BlockDriverState *bs, uint64_t start_offset,
427
+ QEMUBH *bh;
149
}
428
QIONetListener *listener = qio_net_listener_new();
150
s->crypto = qcrypto_block_open(s->crypto_opts, "encrypt.",
429
if (qio_net_listener_open_sync(listener, socket_addr, 1,
151
qcow2_crypto_hdr_read_func,
430
errp) < 0) {
152
- bs, cflags, 1, errp);
431
@@ -XXX,XX +XXX,XX @@ bool vhost_user_server_start(VuServer *server,
153
+ bs, cflags, QCOW2_MAX_THREADS, errp);
432
return false;
154
if (!s->crypto) {
433
}
155
return -EINVAL;
434
156
}
435
+ bh = qemu_bh_new(restart_listener_bh, server);
157
@@ -XXX,XX +XXX,XX @@ static int coroutine_fn qcow2_do_open(BlockDriverState *bs, QDict *options,
436
+
158
cflags |= QCRYPTO_BLOCK_OPEN_NO_IO;
437
/* zero out unspecified fields */
159
}
438
*server = (VuServer) {
160
s->crypto = qcrypto_block_open(s->crypto_opts, "encrypt.",
439
.listener = listener,
161
- NULL, NULL, cflags, 1, errp);
440
+ .restart_listener_bh = bh,
162
+ NULL, NULL, cflags,
441
.vu_iface = vu_iface,
163
+ QCOW2_MAX_THREADS, errp);
442
.max_queues = max_queues,
164
if (!s->crypto) {
443
.ctx = ctx,
165
ret = -EINVAL;
166
goto fail;
167
@@ -XXX,XX +XXX,XX @@ static coroutine_fn int qcow2_co_preadv(BlockDriverState *bs, uint64_t offset,
168
assert(s->crypto);
169
assert((offset & (BDRV_SECTOR_SIZE - 1)) == 0);
170
assert((cur_bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
171
- if (qcrypto_block_decrypt(s->crypto,
172
- (s->crypt_physical_offset ?
173
- cluster_offset + offset_in_cluster :
174
- offset),
175
- cluster_data,
176
- cur_bytes,
177
- NULL) < 0) {
178
+ if (qcow2_co_decrypt(bs, cluster_offset, offset,
179
+ cluster_data, cur_bytes) < 0) {
180
ret = -EIO;
181
goto fail;
182
}
183
@@ -XXX,XX +XXX,XX @@ static coroutine_fn int qcow2_co_pwritev(BlockDriverState *bs, uint64_t offset,
184
QCOW_MAX_CRYPT_CLUSTERS * s->cluster_size);
185
qemu_iovec_to_buf(&hd_qiov, 0, cluster_data, hd_qiov.size);
186
187
- if (qcrypto_block_encrypt(s->crypto,
188
- (s->crypt_physical_offset ?
189
- cluster_offset + offset_in_cluster :
190
- offset),
191
- cluster_data,
192
- cur_bytes, NULL) < 0) {
193
+ if (qcow2_co_encrypt(bs, cluster_offset, offset,
194
+ cluster_data, cur_bytes) < 0) {
195
ret = -EIO;
196
goto out_unlocked;
197
}
198
--
444
--
199
2.21.0
445
2.26.2
200
446
201
Propagate the flush return value since errors are possible.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20200924151549.913737-11-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
block/export/vhost-user-blk-server.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/block/export/vhost-user-blk-server.c b/block/export/vhost-user-blk-server.c
index XXXXXXX..XXXXXXX 100644
--- a/block/export/vhost-user-blk-server.c
+++ b/block/export/vhost-user-blk-server.c
@@ -XXX,XX +XXX,XX @@ vu_block_discard_write_zeroes(VuBlockReq *req, struct iovec *iov,
return -EINVAL;
}

-static void coroutine_fn vu_block_flush(VuBlockReq *req)
+static int coroutine_fn vu_block_flush(VuBlockReq *req)
{
VuBlockDev *vdev_blk = get_vu_block_device_by_server(req->server);
BlockBackend *backend = vdev_blk->backend;
- blk_co_flush(backend);
+ return blk_co_flush(backend);
}

static void coroutine_fn vu_block_virtio_process_req(void *opaque)
@@ -XXX,XX +XXX,XX @@ static void coroutine_fn vu_block_virtio_process_req(void *opaque)
break;
}
case VIRTIO_BLK_T_FLUSH:
- vu_block_flush(req);
- req->in->status = VIRTIO_BLK_S_OK;
+ if (vu_block_flush(req) == 0) {
+ req->in->status = VIRTIO_BLK_S_OK;
+ } else {
+ req->in->status = VIRTIO_BLK_S_IOERR;
+ }
break;
case VIRTIO_BLK_T_GET_ID: {
size_t size = MIN(iov_size(&elem->in_sg[0], in_num),
--
2.26.2

From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

Use thread_pool_submit_co() instead of reinventing it here. Note that
thread_pool_submit_aio() never returns NULL, so checking its return
value was unnecessary.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20190506142741.41731-4-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
---
block/qcow2-threads.c | 17 ++---------------
1 file changed, 2 insertions(+), 15 deletions(-)

diff --git a/block/qcow2-threads.c b/block/qcow2-threads.c

Use the new QAPI block exports API instead of defining our own QOM
objects.

This is a large change because the lifecycle of VuBlockDev needs to
follow BlockExportDriver. QOM properties are replaced by QAPI options
objects.

VuBlockDev is renamed VuBlkExport and contains a BlockExport field.
Several fields can be dropped since BlockExport already has equivalents.

The file names and meson build integration will be adjusted in a future
patch. libvhost-user should probably be built as a static library that
is linked into QEMU instead of as a .c file that results in duplicate
compilation.

The new command-line syntax is:

$ qemu-storage-daemon \
--blockdev file,node-name=drive0,filename=test.img \
--export vhost-user-blk,node-name=drive0,id=export0,unix-socket=/tmp/vhost-user-blk.sock

Note that unix-socket is optional because we may wish to accept chardevs
too in the future.

Markus noted that supported address families are not explicit in the
QAPI schema. It is unlikely that support for more address families will
be added since file descriptor passing is required and few address
families support it. If a new address family needs to be added, then the
QAPI 'features' syntax can be used to advertise them.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Message-id: 20200924151549.913737-12-stefanha@redhat.com
[Skip test on big-endian host architectures because this device doesn't
support them yet (as already mentioned in a code comment).
--Stefan]
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
qapi/block-export.json | 21 +-
block/export/vhost-user-blk-server.h | 23 +-
block/export/export.c | 6 +
block/export/vhost-user-blk-server.c | 452 +++++++--------------------
util/vhost-user-server.c | 10 +-
block/export/meson.build | 1 +
block/meson.build | 1 -
7 files changed, 156 insertions(+), 358 deletions(-)
48
diff --git a/qapi/block-export.json b/qapi/block-export.json
17
index XXXXXXX..XXXXXXX 100644
49
index XXXXXXX..XXXXXXX 100644
18
--- a/block/qcow2-threads.c
50
--- a/qapi/block-export.json
19
+++ b/block/qcow2-threads.c
51
+++ b/qapi/block-export.json
20
@@ -XXX,XX +XXX,XX @@ static int qcow2_compress_pool_func(void *opaque)
52
@@ -XXX,XX +XXX,XX @@
53
'data': { '*name': 'str', '*description': 'str',
54
'*bitmap': 'str' } }
55
56
+##
57
+# @BlockExportOptionsVhostUserBlk:
58
+#
59
+# A vhost-user-blk block export.
60
+#
61
+# @addr: The vhost-user socket on which to listen. Both 'unix' and 'fd'
62
+# SocketAddress types are supported. Passed fds must be UNIX domain
63
+# sockets.
64
+# @logical-block-size: Logical block size in bytes. Defaults to 512 bytes.
65
+#
66
+# Since: 5.2
67
+##
68
+{ 'struct': 'BlockExportOptionsVhostUserBlk',
69
+ 'data': { 'addr': 'SocketAddress', '*logical-block-size': 'size' } }
70
+
71
##
72
# @NbdServerAddOptions:
73
#
74
@@ -XXX,XX +XXX,XX @@
75
# An enumeration of block export types
76
#
77
# @nbd: NBD export
78
+# @vhost-user-blk: vhost-user-blk export (since 5.2)
79
#
80
# Since: 4.2
81
##
82
{ 'enum': 'BlockExportType',
83
- 'data': [ 'nbd' ] }
84
+ 'data': [ 'nbd', 'vhost-user-blk' ] }
85
86
##
87
# @BlockExportOptions:
88
@@ -XXX,XX +XXX,XX @@
89
'*writethrough': 'bool' },
90
'discriminator': 'type',
91
'data': {
92
- 'nbd': 'BlockExportOptionsNbd'
93
+ 'nbd': 'BlockExportOptionsNbd',
94
+ 'vhost-user-blk': 'BlockExportOptionsVhostUserBlk'
95
} }
96
97
##
98
diff --git a/block/export/vhost-user-blk-server.h b/block/export/vhost-user-blk-server.h
99
index XXXXXXX..XXXXXXX 100644
100
--- a/block/export/vhost-user-blk-server.h
101
+++ b/block/export/vhost-user-blk-server.h
102
@@ -XXX,XX +XXX,XX @@
103
104
#ifndef VHOST_USER_BLK_SERVER_H
105
#define VHOST_USER_BLK_SERVER_H
106
-#include "util/vhost-user-server.h"
107
108
-typedef struct VuBlockDev VuBlockDev;
109
-#define TYPE_VHOST_USER_BLK_SERVER "vhost-user-blk-server"
110
-#define VHOST_USER_BLK_SERVER(obj) \
111
- OBJECT_CHECK(VuBlockDev, obj, TYPE_VHOST_USER_BLK_SERVER)
112
+#include "block/export.h"
113
114
-/* vhost user block device */
115
-struct VuBlockDev {
116
- Object parent_obj;
117
- char *node_name;
118
- SocketAddress *addr;
119
- AioContext *ctx;
120
- VuServer vu_server;
121
- bool running;
122
- uint32_t blk_size;
123
- BlockBackend *backend;
124
- QIOChannelSocket *sioc;
125
- QTAILQ_ENTRY(VuBlockDev) next;
126
- struct virtio_blk_config blkcfg;
127
- bool writable;
128
-};
129
+/* For block/export/export.c */
130
+extern const BlockExportDriver blk_exp_vhost_user_blk;
131
132
#endif /* VHOST_USER_BLK_SERVER_H */
133
diff --git a/block/export/export.c b/block/export/export.c
134
index XXXXXXX..XXXXXXX 100644
135
--- a/block/export/export.c
136
+++ b/block/export/export.c
137
@@ -XXX,XX +XXX,XX @@
138
#include "sysemu/block-backend.h"
139
#include "block/export.h"
140
#include "block/nbd.h"
141
+#if CONFIG_LINUX
142
+#include "block/export/vhost-user-blk-server.h"
143
+#endif
144
#include "qapi/error.h"
145
#include "qapi/qapi-commands-block-export.h"
146
#include "qapi/qapi-events-block-export.h"
147
@@ -XXX,XX +XXX,XX @@
148
149
static const BlockExportDriver *blk_exp_drivers[] = {
150
&blk_exp_nbd,
151
+#if CONFIG_LINUX
152
+ &blk_exp_vhost_user_blk,
153
+#endif
154
};
155
156
/* Only accessed from the main thread */
157
diff --git a/block/export/vhost-user-blk-server.c b/block/export/vhost-user-blk-server.c
158
index XXXXXXX..XXXXXXX 100644
159
--- a/block/export/vhost-user-blk-server.c
160
+++ b/block/export/vhost-user-blk-server.c
161
@@ -XXX,XX +XXX,XX @@
162
*/
163
#include "qemu/osdep.h"
164
#include "block/block.h"
165
+#include "contrib/libvhost-user/libvhost-user.h"
166
+#include "standard-headers/linux/virtio_blk.h"
167
+#include "util/vhost-user-server.h"
168
#include "vhost-user-blk-server.h"
169
#include "qapi/error.h"
170
#include "qom/object_interfaces.h"
171
@@ -XXX,XX +XXX,XX @@ struct virtio_blk_inhdr {
172
unsigned char status;
173
};
174
175
-typedef struct VuBlockReq {
176
+typedef struct VuBlkReq {
177
VuVirtqElement elem;
178
int64_t sector_num;
179
size_t size;
180
@@ -XXX,XX +XXX,XX @@ typedef struct VuBlockReq {
181
struct virtio_blk_outhdr out;
182
VuServer *server;
183
struct VuVirtq *vq;
184
-} VuBlockReq;
185
+} VuBlkReq;
186
187
-static void vu_block_req_complete(VuBlockReq *req)
188
+/* vhost user block device */
189
+typedef struct {
190
+ BlockExport export;
191
+ VuServer vu_server;
192
+ uint32_t blk_size;
193
+ QIOChannelSocket *sioc;
194
+ struct virtio_blk_config blkcfg;
195
+ bool writable;
196
+} VuBlkExport;
197
+
198
+static void vu_blk_req_complete(VuBlkReq *req)
199
{
200
VuDev *vu_dev = &req->server->vu_dev;
201
202
@@ -XXX,XX +XXX,XX @@ static void vu_block_req_complete(VuBlockReq *req)
203
free(req);
204
}
205
206
-static VuBlockDev *get_vu_block_device_by_server(VuServer *server)
207
-{
208
- return container_of(server, VuBlockDev, vu_server);
209
-}
210
-
211
static int coroutine_fn
212
-vu_block_discard_write_zeroes(VuBlockReq *req, struct iovec *iov,
213
- uint32_t iovcnt, uint32_t type)
214
+vu_blk_discard_write_zeroes(BlockBackend *blk, struct iovec *iov,
215
+ uint32_t iovcnt, uint32_t type)
216
{
217
struct virtio_blk_discard_write_zeroes desc;
218
ssize_t size = iov_to_buf(iov, iovcnt, 0, &desc, sizeof(desc));
219
@@ -XXX,XX +XXX,XX @@ vu_block_discard_write_zeroes(VuBlockReq *req, struct iovec *iov,
220
return -EINVAL;
221
}
222
223
- VuBlockDev *vdev_blk = get_vu_block_device_by_server(req->server);
224
uint64_t range[2] = { le64_to_cpu(desc.sector) << 9,
225
le32_to_cpu(desc.num_sectors) << 9 };
226
if (type == VIRTIO_BLK_T_DISCARD) {
227
- if (blk_co_pdiscard(vdev_blk->backend, range[0], range[1]) == 0) {
228
+ if (blk_co_pdiscard(blk, range[0], range[1]) == 0) {
229
return 0;
230
}
231
} else if (type == VIRTIO_BLK_T_WRITE_ZEROES) {
232
- if (blk_co_pwrite_zeroes(vdev_blk->backend,
233
- range[0], range[1], 0) == 0) {
234
+ if (blk_co_pwrite_zeroes(blk, range[0], range[1], 0) == 0) {
235
return 0;
236
}
237
}
238
@@ -XXX,XX +XXX,XX @@ vu_block_discard_write_zeroes(VuBlockReq *req, struct iovec *iov,
239
return -EINVAL;
240
}
241
242
-static int coroutine_fn vu_block_flush(VuBlockReq *req)
243
+static void coroutine_fn vu_blk_virtio_process_req(void *opaque)
244
{
245
- VuBlockDev *vdev_blk = get_vu_block_device_by_server(req->server);
246
- BlockBackend *backend = vdev_blk->backend;
247
- return blk_co_flush(backend);
248
-}
249
-
250
-static void coroutine_fn vu_block_virtio_process_req(void *opaque)
251
-{
252
- VuBlockReq *req = opaque;
253
+ VuBlkReq *req = opaque;
254
VuServer *server = req->server;
255
VuVirtqElement *elem = &req->elem;
256
uint32_t type;
257
258
- VuBlockDev *vdev_blk = get_vu_block_device_by_server(server);
259
- BlockBackend *backend = vdev_blk->backend;
260
+ VuBlkExport *vexp = container_of(server, VuBlkExport, vu_server);
261
+ BlockBackend *blk = vexp->export.blk;
262
263
struct iovec *in_iov = elem->in_sg;
264
struct iovec *out_iov = elem->out_sg;
265
@@ -XXX,XX +XXX,XX @@ static void coroutine_fn vu_block_virtio_process_req(void *opaque)
266
bool is_write = type & VIRTIO_BLK_T_OUT;
267
req->sector_num = le64_to_cpu(req->out.sector);
268
269
- int64_t offset = req->sector_num * vdev_blk->blk_size;
270
+ if (is_write && !vexp->writable) {
271
+ req->in->status = VIRTIO_BLK_S_IOERR;
272
+ break;
273
+ }
274
+
275
+ int64_t offset = req->sector_num * vexp->blk_size;
276
QEMUIOVector qiov;
277
if (is_write) {
278
qemu_iovec_init_external(&qiov, out_iov, out_num);
279
- ret = blk_co_pwritev(backend, offset, qiov.size,
280
- &qiov, 0);
281
+ ret = blk_co_pwritev(blk, offset, qiov.size, &qiov, 0);
282
} else {
283
qemu_iovec_init_external(&qiov, in_iov, in_num);
284
- ret = blk_co_preadv(backend, offset, qiov.size,
285
- &qiov, 0);
286
+ ret = blk_co_preadv(blk, offset, qiov.size, &qiov, 0);
287
}
288
if (ret >= 0) {
289
req->in->status = VIRTIO_BLK_S_OK;
290
@@ -XXX,XX +XXX,XX @@ static void coroutine_fn vu_block_virtio_process_req(void *opaque)
291
break;
292
}
293
case VIRTIO_BLK_T_FLUSH:
294
- if (vu_block_flush(req) == 0) {
295
+ if (blk_co_flush(blk) == 0) {
296
req->in->status = VIRTIO_BLK_S_OK;
297
} else {
298
req->in->status = VIRTIO_BLK_S_IOERR;
299
@@ -XXX,XX +XXX,XX @@ static void coroutine_fn vu_block_virtio_process_req(void *opaque)
300
case VIRTIO_BLK_T_DISCARD:
301
case VIRTIO_BLK_T_WRITE_ZEROES: {
302
int rc;
303
- rc = vu_block_discard_write_zeroes(req, &elem->out_sg[1],
304
- out_num, type);
305
+
306
+ if (!vexp->writable) {
307
+ req->in->status = VIRTIO_BLK_S_IOERR;
308
+ break;
309
+ }
310
+
311
+ rc = vu_blk_discard_write_zeroes(blk, &elem->out_sg[1], out_num, type);
312
if (rc == 0) {
313
req->in->status = VIRTIO_BLK_S_OK;
314
} else {
315
@@ -XXX,XX +XXX,XX @@ static void coroutine_fn vu_block_virtio_process_req(void *opaque)
316
break;
317
}
318
319
- vu_block_req_complete(req);
320
+ vu_blk_req_complete(req);
321
return;
322
323
err:
324
- free(elem);
325
+ free(req);
326
}
327
328
-static void vu_block_process_vq(VuDev *vu_dev, int idx)
329
+static void vu_blk_process_vq(VuDev *vu_dev, int idx)
330
{
331
VuServer *server = container_of(vu_dev, VuServer, vu_dev);
332
VuVirtq *vq = vu_get_queue(vu_dev, idx);
333
334
while (1) {
335
- VuBlockReq *req;
336
+ VuBlkReq *req;
337
338
- req = vu_queue_pop(vu_dev, vq, sizeof(VuBlockReq));
339
+ req = vu_queue_pop(vu_dev, vq, sizeof(VuBlkReq));
340
if (!req) {
341
break;
342
}
343
@@ -XXX,XX +XXX,XX @@ static void vu_block_process_vq(VuDev *vu_dev, int idx)
344
req->vq = vq;
345
346
Coroutine *co =
347
- qemu_coroutine_create(vu_block_virtio_process_req, req);
348
+ qemu_coroutine_create(vu_blk_virtio_process_req, req);
349
qemu_coroutine_enter(co);
350
}
351
}
352
353
-static void vu_block_queue_set_started(VuDev *vu_dev, int idx, bool started)
354
+static void vu_blk_queue_set_started(VuDev *vu_dev, int idx, bool started)
355
{
356
VuVirtq *vq;
357
358
assert(vu_dev);
359
360
vq = vu_get_queue(vu_dev, idx);
361
- vu_set_queue_handler(vu_dev, vq, started ? vu_block_process_vq : NULL);
362
+ vu_set_queue_handler(vu_dev, vq, started ? vu_blk_process_vq : NULL);
363
}
364
365
-static uint64_t vu_block_get_features(VuDev *dev)
366
+static uint64_t vu_blk_get_features(VuDev *dev)
367
{
368
uint64_t features;
369
VuServer *server = container_of(dev, VuServer, vu_dev);
370
- VuBlockDev *vdev_blk = get_vu_block_device_by_server(server);
371
+ VuBlkExport *vexp = container_of(server, VuBlkExport, vu_server);
372
features = 1ull << VIRTIO_BLK_F_SIZE_MAX |
373
1ull << VIRTIO_BLK_F_SEG_MAX |
374
1ull << VIRTIO_BLK_F_TOPOLOGY |
375
@@ -XXX,XX +XXX,XX @@ static uint64_t vu_block_get_features(VuDev *dev)
376
1ull << VIRTIO_RING_F_EVENT_IDX |
377
1ull << VHOST_USER_F_PROTOCOL_FEATURES;
378
379
- if (!vdev_blk->writable) {
380
+ if (!vexp->writable) {
381
features |= 1ull << VIRTIO_BLK_F_RO;
382
}
383
384
return features;
385
}
386
387
-static uint64_t vu_block_get_protocol_features(VuDev *dev)
388
+static uint64_t vu_blk_get_protocol_features(VuDev *dev)
389
{
390
return 1ull << VHOST_USER_PROTOCOL_F_CONFIG |
391
1ull << VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD;
392
}
393
394
static int
395
-vu_block_get_config(VuDev *vu_dev, uint8_t *config, uint32_t len)
396
+vu_blk_get_config(VuDev *vu_dev, uint8_t *config, uint32_t len)
397
{
398
+ /* TODO blkcfg must be little-endian for VIRTIO 1.0 */
399
VuServer *server = container_of(vu_dev, VuServer, vu_dev);
400
- VuBlockDev *vdev_blk = get_vu_block_device_by_server(server);
401
- memcpy(config, &vdev_blk->blkcfg, len);
402
-
403
+ VuBlkExport *vexp = container_of(server, VuBlkExport, vu_server);
404
+ memcpy(config, &vexp->blkcfg, len);
21
return 0;
405
return 0;
22
}
406
}
23
407
24
-static void qcow2_compress_complete(void *opaque, int ret)
408
static int
25
-{
409
-vu_block_set_config(VuDev *vu_dev, const uint8_t *data,
26
- qemu_coroutine_enter(opaque);
410
+vu_blk_set_config(VuDev *vu_dev, const uint8_t *data,
27
-}
411
uint32_t offset, uint32_t size, uint32_t flags)
28
-
412
{
29
static ssize_t coroutine_fn
413
VuServer *server = container_of(vu_dev, VuServer, vu_dev);
30
qcow2_co_do_compress(BlockDriverState *bs, void *dest, size_t dest_size,
414
- VuBlockDev *vdev_blk = get_vu_block_device_by_server(server);
31
const void *src, size_t src_size, Qcow2CompressFunc func)
415
+ VuBlkExport *vexp = container_of(server, VuBlkExport, vu_server);
32
{
416
uint8_t wce;
33
BDRVQcow2State *s = bs->opaque;
417
34
- BlockAIOCB *acb;
418
/* don't support live migration */
35
ThreadPool *pool = aio_get_thread_pool(bdrv_get_aio_context(bs));
419
@@ -XXX,XX +XXX,XX @@ vu_block_set_config(VuDev *vu_dev, const uint8_t *data,
36
Qcow2CompressData arg = {
37
.dest = dest,
38
@@ -XXX,XX +XXX,XX @@ qcow2_co_do_compress(BlockDriverState *bs, void *dest, size_t dest_size,
39
}
420
}
40
421
41
s->nb_compress_threads++;
422
wce = *data;
42
- acb = thread_pool_submit_aio(pool, qcow2_compress_pool_func, &arg,
423
- vdev_blk->blkcfg.wce = wce;
43
- qcow2_compress_complete,
424
- blk_set_enable_write_cache(vdev_blk->backend, wce);
44
- qemu_coroutine_self());
425
+ vexp->blkcfg.wce = wce;
45
-
426
+ blk_set_enable_write_cache(vexp->export.blk, wce);
46
- if (!acb) {
427
return 0;
47
- s->nb_compress_threads--;
428
}
48
- return -EINVAL;
429
49
- }
430
@@ -XXX,XX +XXX,XX @@ vu_block_set_config(VuDev *vu_dev, const uint8_t *data,
50
- qemu_coroutine_yield();
431
* of vu_process_message.
51
+ thread_pool_submit_co(pool, qcow2_compress_pool_func, &arg);
432
*
52
s->nb_compress_threads--;
433
*/
434
-static int vu_block_process_msg(VuDev *dev, VhostUserMsg *vmsg, int *do_reply)
435
+static int vu_blk_process_msg(VuDev *dev, VhostUserMsg *vmsg, int *do_reply)
436
{
437
if (vmsg->request == VHOST_USER_NONE) {
438
dev->panic(dev, "disconnect");
439
@@ -XXX,XX +XXX,XX @@ static int vu_block_process_msg(VuDev *dev, VhostUserMsg *vmsg, int *do_reply)
440
return false;
441
}
442
443
-static const VuDevIface vu_block_iface = {
444
- .get_features = vu_block_get_features,
445
- .queue_set_started = vu_block_queue_set_started,
446
- .get_protocol_features = vu_block_get_protocol_features,
447
- .get_config = vu_block_get_config,
448
- .set_config = vu_block_set_config,
449
- .process_msg = vu_block_process_msg,
450
+static const VuDevIface vu_blk_iface = {
451
+ .get_features = vu_blk_get_features,
452
+ .queue_set_started = vu_blk_queue_set_started,
453
+ .get_protocol_features = vu_blk_get_protocol_features,
454
+ .get_config = vu_blk_get_config,
455
+ .set_config = vu_blk_set_config,
456
+ .process_msg = vu_blk_process_msg,
457
};
458
459
static void blk_aio_attached(AioContext *ctx, void *opaque)
460
{
461
- VuBlockDev *vub_dev = opaque;
462
- vhost_user_server_attach_aio_context(&vub_dev->vu_server, ctx);
463
+ VuBlkExport *vexp = opaque;
464
+ vhost_user_server_attach_aio_context(&vexp->vu_server, ctx);
465
}
466
467
static void blk_aio_detach(void *opaque)
468
{
469
- VuBlockDev *vub_dev = opaque;
470
- vhost_user_server_detach_aio_context(&vub_dev->vu_server);
471
+ VuBlkExport *vexp = opaque;
472
+ vhost_user_server_detach_aio_context(&vexp->vu_server);
473
}
474
475
static void
476
-vu_block_initialize_config(BlockDriverState *bs,
477
+vu_blk_initialize_config(BlockDriverState *bs,
478
struct virtio_blk_config *config, uint32_t blk_size)
479
{
480
config->capacity = bdrv_getlength(bs) >> BDRV_SECTOR_BITS;
481
@@ -XXX,XX +XXX,XX @@ vu_block_initialize_config(BlockDriverState *bs,
482
config->max_write_zeroes_seg = 1;
483
}
484
485
-static VuBlockDev *vu_block_init(VuBlockDev *vu_block_device, Error **errp)
486
+static void vu_blk_exp_request_shutdown(BlockExport *exp)
487
{
488
+ VuBlkExport *vexp = container_of(exp, VuBlkExport, export);
489
490
- BlockBackend *blk;
491
- Error *local_error = NULL;
492
- const char *node_name = vu_block_device->node_name;
493
- bool writable = vu_block_device->writable;
494
- uint64_t perm = BLK_PERM_CONSISTENT_READ;
495
- int ret;
496
-
497
- AioContext *ctx;
498
-
499
- BlockDriverState *bs = bdrv_lookup_bs(node_name, node_name, &local_error);
500
-
501
- if (!bs) {
502
- error_propagate(errp, local_error);
503
- return NULL;
504
- }
505
-
506
- if (bdrv_is_read_only(bs)) {
507
- writable = false;
508
- }
509
-
510
- if (writable) {
511
- perm |= BLK_PERM_WRITE;
512
- }
513
-
514
- ctx = bdrv_get_aio_context(bs);
515
- aio_context_acquire(ctx);
516
- bdrv_invalidate_cache(bs, NULL);
517
- aio_context_release(ctx);
518
-
519
- /*
520
- * Don't allow resize while the vhost user server is running,
521
- * otherwise we don't care what happens with the node.
522
- */
523
- blk = blk_new(bdrv_get_aio_context(bs), perm,
524
- BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE_UNCHANGED |
525
- BLK_PERM_WRITE | BLK_PERM_GRAPH_MOD);
526
- ret = blk_insert_bs(blk, bs, errp);
527
-
528
- if (ret < 0) {
529
- goto fail;
530
- }
531
-
532
- blk_set_enable_write_cache(blk, false);
533
-
534
- blk_set_allow_aio_context_change(blk, true);
535
-
536
- vu_block_device->blkcfg.wce = 0;
537
- vu_block_device->backend = blk;
538
- if (!vu_block_device->blk_size) {
539
- vu_block_device->blk_size = BDRV_SECTOR_SIZE;
540
- }
541
- vu_block_device->blkcfg.blk_size = vu_block_device->blk_size;
542
- blk_set_guest_block_size(blk, vu_block_device->blk_size);
543
- vu_block_initialize_config(bs, &vu_block_device->blkcfg,
544
- vu_block_device->blk_size);
545
- return vu_block_device;
546
-
547
-fail:
548
- blk_unref(blk);
549
- return NULL;
550
-}
551
-
552
-static void vu_block_deinit(VuBlockDev *vu_block_device)
553
-{
554
- if (vu_block_device->backend) {
555
- blk_remove_aio_context_notifier(vu_block_device->backend, blk_aio_attached,
556
- blk_aio_detach, vu_block_device);
557
- }
558
-
559
- blk_unref(vu_block_device->backend);
560
-}
561
-
562
-static void vhost_user_blk_server_stop(VuBlockDev *vu_block_device)
563
-{
564
- vhost_user_server_stop(&vu_block_device->vu_server);
565
- vu_block_deinit(vu_block_device);
566
-}
567
-
568
-static void vhost_user_blk_server_start(VuBlockDev *vu_block_device,
569
- Error **errp)
570
-{
571
- AioContext *ctx;
572
- SocketAddress *addr = vu_block_device->addr;
573
-
574
- if (!vu_block_init(vu_block_device, errp)) {
575
- return;
576
- }
577
-
578
- ctx = bdrv_get_aio_context(blk_bs(vu_block_device->backend));
579
-
580
- if (!vhost_user_server_start(&vu_block_device->vu_server, addr, ctx,
581
- VHOST_USER_BLK_MAX_QUEUES, &vu_block_iface,
582
- errp)) {
583
- goto error;
584
- }
585
-
586
- blk_add_aio_context_notifier(vu_block_device->backend, blk_aio_attached,
587
- blk_aio_detach, vu_block_device);
588
- vu_block_device->running = true;
589
- return;
590
-
591
- error:
592
- vu_block_deinit(vu_block_device);
593
-}
594
-
595
-static bool vu_prop_modifiable(VuBlockDev *vus, Error **errp)
596
-{
597
- if (vus->running) {
598
- error_setg(errp, "The property can't be modified "
599
- "while the server is running");
600
- return false;
601
- }
602
- return true;
603
-}
604
-
605
-static void vu_set_node_name(Object *obj, const char *value, Error **errp)
606
-{
607
- VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
608
-
609
- if (!vu_prop_modifiable(vus, errp)) {
610
- return;
611
- }
612
-
613
- if (vus->node_name) {
614
- g_free(vus->node_name);
615
- }
616
-
617
- vus->node_name = g_strdup(value);
618
-}
619
-
620
-static char *vu_get_node_name(Object *obj, Error **errp)
621
-{
622
- VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
623
- return g_strdup(vus->node_name);
624
-}
625
-
626
-static void free_socket_addr(SocketAddress *addr)
627
-{
628
- g_free(addr->u.q_unix.path);
629
- g_free(addr);
630
-}
631
-
632
-static void vu_set_unix_socket(Object *obj, const char *value,
633
- Error **errp)
634
-{
635
- VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
636
-
637
- if (!vu_prop_modifiable(vus, errp)) {
638
- return;
639
- }
640
-
641
- if (vus->addr) {
642
- free_socket_addr(vus->addr);
643
- }
644
-
645
- SocketAddress *addr = g_new0(SocketAddress, 1);
646
- addr->type = SOCKET_ADDRESS_TYPE_UNIX;
647
- addr->u.q_unix.path = g_strdup(value);
648
- vus->addr = addr;
649
+ vhost_user_server_stop(&vexp->vu_server);
650
}
651
652
-static char *vu_get_unix_socket(Object *obj, Error **errp)
653
+static int vu_blk_exp_create(BlockExport *exp, BlockExportOptions *opts,
654
+ Error **errp)
655
{
656
- VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
657
- return g_strdup(vus->addr->u.q_unix.path);
658
-}
659
-
660
-static bool vu_get_block_writable(Object *obj, Error **errp)
661
-{
662
- VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
663
- return vus->writable;
664
-}
665
-
666
-static void vu_set_block_writable(Object *obj, bool value, Error **errp)
667
-{
668
- VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
669
-
670
- if (!vu_prop_modifiable(vus, errp)) {
671
- return;
672
- }
673
-
674
- vus->writable = value;
675
-}
676
-
677
-static void vu_get_blk_size(Object *obj, Visitor *v, const char *name,
678
- void *opaque, Error **errp)
679
-{
680
- VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
681
- uint32_t value = vus->blk_size;
682
-
683
- visit_type_uint32(v, name, &value, errp);
684
-}
685
-
686
-static void vu_set_blk_size(Object *obj, Visitor *v, const char *name,
687
- void *opaque, Error **errp)
688
-{
689
- VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
690
-
691
+ VuBlkExport *vexp = container_of(exp, VuBlkExport, export);
692
+ BlockExportOptionsVhostUserBlk *vu_opts = &opts->u.vhost_user_blk;
693
Error *local_err = NULL;
694
- uint32_t value;
695
+ uint64_t logical_block_size;
696
697
- if (!vu_prop_modifiable(vus, errp)) {
698
- return;
699
- }
700
+ vexp->writable = opts->writable;
701
+ vexp->blkcfg.wce = 0;
702
703
- visit_type_uint32(v, name, &value, &local_err);
704
- if (local_err) {
705
- goto out;
706
+ if (vu_opts->has_logical_block_size) {
707
+ logical_block_size = vu_opts->logical_block_size;
708
+ } else {
709
+ logical_block_size = BDRV_SECTOR_SIZE;
710
}
711
-
712
- check_block_size(object_get_typename(obj), name, value, &local_err);
713
+ check_block_size(exp->id, "logical-block-size", logical_block_size,
714
+ &local_err);
715
if (local_err) {
716
- goto out;
717
+ error_propagate(errp, local_err);
718
+ return -EINVAL;
719
+ }
720
+ vexp->blk_size = logical_block_size;
721
+ blk_set_guest_block_size(exp->blk, logical_block_size);
722
+ vu_blk_initialize_config(blk_bs(exp->blk), &vexp->blkcfg,
723
+ logical_block_size);
53
+
724
+
54
qemu_co_queue_next(&s->compress_wait_queue);
725
+ blk_set_allow_aio_context_change(exp->blk, true);
55
726
+ blk_add_aio_context_notifier(exp->blk, blk_aio_attached, blk_aio_detach,
56
return arg.ret;
727
+ vexp);
728
+
729
+ if (!vhost_user_server_start(&vexp->vu_server, vu_opts->addr, exp->ctx,
730
+ VHOST_USER_BLK_MAX_QUEUES, &vu_blk_iface,
731
+ errp)) {
732
+ blk_remove_aio_context_notifier(exp->blk, blk_aio_attached,
733
+ blk_aio_detach, vexp);
734
+ return -EADDRNOTAVAIL;
735
}
736
737
- vus->blk_size = value;
738
-
739
-out:
740
- error_propagate(errp, local_err);
741
-}
742
-
743
-static void vhost_user_blk_server_instance_finalize(Object *obj)
744
-{
745
- VuBlockDev *vub = VHOST_USER_BLK_SERVER(obj);
746
-
747
- vhost_user_blk_server_stop(vub);
748
-
749
- /*
750
- * Unlike object_property_add_str, object_class_property_add_str
751
- * doesn't have a release method. Thus manual memory freeing is
752
- * needed.
753
- */
754
- free_socket_addr(vub->addr);
755
- g_free(vub->node_name);
756
-}
757
-
758
-static void vhost_user_blk_server_complete(UserCreatable *obj, Error **errp)
759
-{
760
- VuBlockDev *vub = VHOST_USER_BLK_SERVER(obj);
761
-
762
- vhost_user_blk_server_start(vub, errp);
763
+ return 0;
764
}
765
766
-static void vhost_user_blk_server_class_init(ObjectClass *klass,
767
- void *class_data)
768
+static void vu_blk_exp_delete(BlockExport *exp)
769
{
770
- UserCreatableClass *ucc = USER_CREATABLE_CLASS(klass);
771
- ucc->complete = vhost_user_blk_server_complete;
772
-
773
- object_class_property_add_bool(klass, "writable",
774
- vu_get_block_writable,
775
- vu_set_block_writable);
776
-
777
- object_class_property_add_str(klass, "node-name",
778
- vu_get_node_name,
779
- vu_set_node_name);
780
-
781
- object_class_property_add_str(klass, "unix-socket",
782
- vu_get_unix_socket,
783
- vu_set_unix_socket);
784
+ VuBlkExport *vexp = container_of(exp, VuBlkExport, export);
785
786
- object_class_property_add(klass, "logical-block-size", "uint32",
787
- vu_get_blk_size, vu_set_blk_size,
788
- NULL, NULL);
789
+ blk_remove_aio_context_notifier(exp->blk, blk_aio_attached, blk_aio_detach,
790
+ vexp);
791
}
792
793
-static const TypeInfo vhost_user_blk_server_info = {
794
- .name = TYPE_VHOST_USER_BLK_SERVER,
795
- .parent = TYPE_OBJECT,
796
- .instance_size = sizeof(VuBlockDev),
797
- .instance_finalize = vhost_user_blk_server_instance_finalize,
798
- .class_init = vhost_user_blk_server_class_init,
799
- .interfaces = (InterfaceInfo[]) {
800
- {TYPE_USER_CREATABLE},
801
- {}
802
- },
803
+const BlockExportDriver blk_exp_vhost_user_blk = {
804
+ .type = BLOCK_EXPORT_TYPE_VHOST_USER_BLK,
805
+ .instance_size = sizeof(VuBlkExport),
806
+ .create = vu_blk_exp_create,
807
+ .delete = vu_blk_exp_delete,
808
+ .request_shutdown = vu_blk_exp_request_shutdown,
809
};
810
-
811
-static void vhost_user_blk_server_register_types(void)
812
-{
813
- type_register_static(&vhost_user_blk_server_info);
814
-}
815
-
816
-type_init(vhost_user_blk_server_register_types)
817
diff --git a/util/vhost-user-server.c b/util/vhost-user-server.c
818
index XXXXXXX..XXXXXXX 100644
819
--- a/util/vhost-user-server.c
820
+++ b/util/vhost-user-server.c
821
@@ -XXX,XX +XXX,XX @@ bool vhost_user_server_start(VuServer *server,
822
Error **errp)
823
{
824
QEMUBH *bh;
825
- QIONetListener *listener = qio_net_listener_new();
826
+ QIONetListener *listener;
827
+
828
+ if (socket_addr->type != SOCKET_ADDRESS_TYPE_UNIX &&
829
+ socket_addr->type != SOCKET_ADDRESS_TYPE_FD) {
830
+ error_setg(errp, "Only socket address types 'unix' and 'fd' are supported");
831
+ return false;
832
+ }
833
+
834
+ listener = qio_net_listener_new();
835
if (qio_net_listener_open_sync(listener, socket_addr, 1,
836
errp) < 0) {
837
object_unref(OBJECT(listener));
838
diff --git a/block/export/meson.build b/block/export/meson.build
839
index XXXXXXX..XXXXXXX 100644
840
--- a/block/export/meson.build
841
+++ b/block/export/meson.build
842
@@ -1 +1,2 @@
843
block_ss.add(files('export.c'))
844
+block_ss.add(when: 'CONFIG_LINUX', if_true: files('vhost-user-blk-server.c', '../../contrib/libvhost-user/libvhost-user.c'))
845
diff --git a/block/meson.build b/block/meson.build
846
index XXXXXXX..XXXXXXX 100644
847
--- a/block/meson.build
848
+++ b/block/meson.build
849
@@ -XXX,XX +XXX,XX @@ block_ss.add(when: 'CONFIG_WIN32', if_true: files('file-win32.c', 'win32-aio.c')
850
block_ss.add(when: 'CONFIG_POSIX', if_true: [files('file-posix.c'), coref, iokit])
851
block_ss.add(when: 'CONFIG_LIBISCSI', if_true: files('iscsi-opts.c'))
852
block_ss.add(when: 'CONFIG_LINUX', if_true: files('nvme.c'))
853
-block_ss.add(when: 'CONFIG_LINUX', if_true: files('export/vhost-user-blk-server.c', '../contrib/libvhost-user/libvhost-user.c'))
854
block_ss.add(when: 'CONFIG_REPLICATION', if_true: files('replication.c'))
855
block_ss.add(when: 'CONFIG_SHEEPDOG', if_true: files('sheepdog.c'))
856
block_ss.add(when: ['CONFIG_LINUX_AIO', libaio], if_true: files('linux-aio.c'))
57
--
857
--
58
2.21.0
858
2.26.2
59
859
60
New patch
1
Headers used by other subsystems are located in include/. Also add the
2
vhost-user-server and vhost-user-blk-server headers to MAINTAINERS.
1
3
4
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
5
Message-id: 20200924151549.913737-13-stefanha@redhat.com
6
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
7
---
8
MAINTAINERS | 4 +++-
9
{util => include/qemu}/vhost-user-server.h | 0
10
block/export/vhost-user-blk-server.c | 2 +-
11
util/vhost-user-server.c | 2 +-
12
4 files changed, 5 insertions(+), 3 deletions(-)
13
rename {util => include/qemu}/vhost-user-server.h (100%)
14
15
diff --git a/MAINTAINERS b/MAINTAINERS
16
index XXXXXXX..XXXXXXX 100644
17
--- a/MAINTAINERS
18
+++ b/MAINTAINERS
19
@@ -XXX,XX +XXX,XX @@ Vhost-user block device backend server
20
M: Coiby Xu <Coiby.Xu@gmail.com>
21
S: Maintained
22
F: block/export/vhost-user-blk-server.c
23
-F: util/vhost-user-server.c
24
+F: block/export/vhost-user-blk-server.h
25
+F: include/qemu/vhost-user-server.h
26
F: tests/qtest/libqos/vhost-user-blk.c
27
+F: util/vhost-user-server.c
28
29
Replication
30
M: Wen Congyang <wencongyang2@huawei.com>
31
diff --git a/util/vhost-user-server.h b/include/qemu/vhost-user-server.h
32
similarity index 100%
33
rename from util/vhost-user-server.h
34
rename to include/qemu/vhost-user-server.h
35
diff --git a/block/export/vhost-user-blk-server.c b/block/export/vhost-user-blk-server.c
36
index XXXXXXX..XXXXXXX 100644
37
--- a/block/export/vhost-user-blk-server.c
38
+++ b/block/export/vhost-user-blk-server.c
39
@@ -XXX,XX +XXX,XX @@
40
#include "block/block.h"
41
#include "contrib/libvhost-user/libvhost-user.h"
42
#include "standard-headers/linux/virtio_blk.h"
43
-#include "util/vhost-user-server.h"
44
+#include "qemu/vhost-user-server.h"
45
#include "vhost-user-blk-server.h"
46
#include "qapi/error.h"
47
#include "qom/object_interfaces.h"
48
diff --git a/util/vhost-user-server.c b/util/vhost-user-server.c
49
index XXXXXXX..XXXXXXX 100644
50
--- a/util/vhost-user-server.c
51
+++ b/util/vhost-user-server.c
52
@@ -XXX,XX +XXX,XX @@
53
*/
54
#include "qemu/osdep.h"
55
#include "qemu/main-loop.h"
56
+#include "qemu/vhost-user-server.h"
57
#include "block/aio-wait.h"
58
-#include "vhost-user-server.h"
59
60
/*
61
* Theory of operation:
62
--
63
2.26.2
64
1
From: Sam Eiderman <shmuel.eiderman@oracle.com>
1
Don't compile contrib/libvhost-user/libvhost-user.c again. Instead build
2
the static library once and then reuse it throughout QEMU.
2
3
3
In the following case:
4
Also switch from CONFIG_LINUX to CONFIG_VHOST_USER, which is what the
5
vhost-user tools (vhost-user-gpu, etc) do.
4
6
5
(base) A <- B <- C (tip)
7
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
8
Message-id: 20200924151549.913737-14-stefanha@redhat.com
9
[Added CONFIG_LINUX again because libvhost-user doesn't build on macOS.
10
--Stefan]
11
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
12
---
13
block/export/export.c | 8 ++++----
14
block/export/meson.build | 2 +-
15
contrib/libvhost-user/meson.build | 1 +
16
meson.build | 6 +++++-
17
util/meson.build | 4 +++-
18
5 files changed, 14 insertions(+), 7 deletions(-)
6
19
7
when running:
20
diff --git a/block/export/export.c b/block/export/export.c
21
index XXXXXXX..XXXXXXX 100644
22
--- a/block/export/export.c
23
+++ b/block/export/export.c
24
@@ -XXX,XX +XXX,XX @@
25
#include "sysemu/block-backend.h"
26
#include "block/export.h"
27
#include "block/nbd.h"
28
-#if CONFIG_LINUX
29
-#include "block/export/vhost-user-blk-server.h"
30
-#endif
31
#include "qapi/error.h"
32
#include "qapi/qapi-commands-block-export.h"
33
#include "qapi/qapi-events-block-export.h"
34
#include "qemu/id.h"
35
+#ifdef CONFIG_VHOST_USER
36
+#include "vhost-user-blk-server.h"
37
+#endif
38
39
static const BlockExportDriver *blk_exp_drivers[] = {
40
&blk_exp_nbd,
41
-#if CONFIG_LINUX
42
+#ifdef CONFIG_VHOST_USER
43
&blk_exp_vhost_user_blk,
44
#endif
45
};
46
diff --git a/block/export/meson.build b/block/export/meson.build
47
index XXXXXXX..XXXXXXX 100644
48
--- a/block/export/meson.build
49
+++ b/block/export/meson.build
50
@@ -XXX,XX +XXX,XX @@
51
block_ss.add(files('export.c'))
52
-block_ss.add(when: 'CONFIG_LINUX', if_true: files('vhost-user-blk-server.c', '../../contrib/libvhost-user/libvhost-user.c'))
53
+block_ss.add(when: ['CONFIG_LINUX', 'CONFIG_VHOST_USER'], if_true: files('vhost-user-blk-server.c'))
54
diff --git a/contrib/libvhost-user/meson.build b/contrib/libvhost-user/meson.build
55
index XXXXXXX..XXXXXXX 100644
56
--- a/contrib/libvhost-user/meson.build
57
+++ b/contrib/libvhost-user/meson.build
58
@@ -XXX,XX +XXX,XX @@
59
libvhost_user = static_library('vhost-user',
60
files('libvhost-user.c', 'libvhost-user-glib.c'),
61
build_by_default: false)
62
+vhost_user = declare_dependency(link_with: libvhost_user)
63
diff --git a/meson.build b/meson.build
64
index XXXXXXX..XXXXXXX 100644
65
--- a/meson.build
66
+++ b/meson.build
67
@@ -XXX,XX +XXX,XX @@ trace_events_subdirs += [
68
'util',
69
]
70
71
+vhost_user = not_found
72
+if 'CONFIG_VHOST_USER' in config_host
73
+ subdir('contrib/libvhost-user')
74
+endif
75
+
76
subdir('qapi')
77
subdir('qobject')
78
subdir('stubs')
79
@@ -XXX,XX +XXX,XX @@ if have_tools
80
install: true)
81
82
if 'CONFIG_VHOST_USER' in config_host
83
- subdir('contrib/libvhost-user')
84
subdir('contrib/vhost-user-blk')
85
subdir('contrib/vhost-user-gpu')
86
subdir('contrib/vhost-user-input')
87
diff --git a/util/meson.build b/util/meson.build
88
index XXXXXXX..XXXXXXX 100644
89
--- a/util/meson.build
90
+++ b/util/meson.build
91
@@ -XXX,XX +XXX,XX @@ if have_block
92
util_ss.add(files('main-loop.c'))
93
util_ss.add(files('nvdimm-utils.c'))
94
util_ss.add(files('qemu-coroutine.c', 'qemu-coroutine-lock.c', 'qemu-coroutine-io.c'))
95
- util_ss.add(when: 'CONFIG_LINUX', if_true: files('vhost-user-server.c'))
96
+ util_ss.add(when: ['CONFIG_LINUX', 'CONFIG_VHOST_USER'], if_true: [
97
+ files('vhost-user-server.c'), vhost_user
98
+ ])
99
util_ss.add(files('block-helpers.c'))
100
util_ss.add(files('qemu-coroutine-sleep.c'))
101
util_ss.add(files('qemu-co-shared-resource.c'))
102
--
103
2.26.2
8
104
9
qemu-img rebase -b A C
10
11
QEMU would read all sectors not allocated in the file being rebased (C)
12
and compare them to the new base image (A), regardless of whether they
13
were changed or even allocated anywhere along the chain between the new
14
base and the top image (B). This causes many unneeded reads when
15
rebasing an image which represents a small diff of a large disk, as it
16
would read most of the disk's sectors.
17
18
Instead, use bdrv_is_allocated_above() to reduce the number of
19
unnecessary reads.
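
As a rough illustration of the scenario (file names invented; the qcow2 backing options are spelled out only for clarity), the A <- B <- C chain above could be built and rebased like this:

    qemu-img create -f qcow2 A.qcow2 100G
    qemu-img create -f qcow2 -o backing_file=A.qcow2,backing_fmt=qcow2 B.qcow2
    qemu-img create -f qcow2 -o backing_file=B.qcow2,backing_fmt=qcow2 C.qcow2
    # safe-mode rebase of the tip onto the original base
    qemu-img rebase -b A.qcow2 C.qcow2

With the optimization, only clusters allocated somewhere in B or C are read and
compared against A; clusters untouched since A are skipped entirely.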
20
21
Reviewed-by: Karl Heubaum <karl.heubaum@oracle.com>
22
Signed-off-by: Sam Eiderman <shmuel.eiderman@oracle.com>
23
Signed-off-by: Eyal Moscovici <eyal.moscovici@oracle.com>
24
Message-id: 20190523163337.4497-3-shmuel.eiderman@oracle.com
25
Signed-off-by: Max Reitz <mreitz@redhat.com>
26
---
27
qemu-img.c | 25 ++++++++++++++++++++++++-
28
1 file changed, 24 insertions(+), 1 deletion(-)
29
30
diff --git a/qemu-img.c b/qemu-img.c
31
index XXXXXXX..XXXXXXX 100644
32
--- a/qemu-img.c
33
+++ b/qemu-img.c
34
@@ -XXX,XX +XXX,XX @@ static int img_rebase(int argc, char **argv)
35
BlockBackend *blk = NULL, *blk_old_backing = NULL, *blk_new_backing = NULL;
36
uint8_t *buf_old = NULL;
37
uint8_t *buf_new = NULL;
38
- BlockDriverState *bs = NULL;
39
+ BlockDriverState *bs = NULL, *prefix_chain_bs = NULL;
40
char *filename;
41
const char *fmt, *cache, *src_cache, *out_basefmt, *out_baseimg;
42
int c, flags, src_flags, ret;
43
@@ -XXX,XX +XXX,XX @@ static int img_rebase(int argc, char **argv)
44
goto out;
45
}
46
47
+ /*
48
+ * Find out whether we rebase an image on top of a previous image
49
+ * in its chain.
50
+ */
51
+ prefix_chain_bs = bdrv_find_backing_image(bs, out_real_path);
52
+
53
blk_new_backing = blk_new_open(out_real_path, NULL,
54
options, src_flags, &local_err);
55
g_free(out_real_path);
56
@@ -XXX,XX +XXX,XX @@ static int img_rebase(int argc, char **argv)
57
continue;
58
}
59
60
+ if (prefix_chain_bs) {
61
+ /*
62
+ * If cluster wasn't changed since prefix_chain, we don't need
63
+ * to take action
64
+ */
65
+ ret = bdrv_is_allocated_above(backing_bs(bs), prefix_chain_bs,
66
+ offset, n, &n);
67
+ if (ret < 0) {
68
+ error_report("error while reading image metadata: %s",
69
+ strerror(-ret));
70
+ goto out;
71
+ }
72
+ if (!ret) {
73
+ continue;
74
+ }
75
+ }
76
+
77
/*
78
* Read old and new backing file and take into consideration that
79
* backing files may be smaller than the COW image.
80
--
81
2.21.0
82
83
1
From: Sam Eiderman <shmuel.eiderman@oracle.com>
1
Introduce libblkdev.fa to avoid recompiling blockdev_ss twice.
2
2
3
In safe mode we open the entire chain, including the parent backing
3
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
4
file of the rebased file.
4
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
5
Do not open a new BlockBackend for the parent backing file; this
5
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
6
avoids opening the rest of the chain a second time, which for long chains
6
Message-id: 20200929125516.186715-3-stefanha@redhat.com
7
saves many "pricy" bdrv_open() calls.
7
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
8
---
9
meson.build | 12 ++++++++++--
10
storage-daemon/meson.build | 3 +--
11
2 files changed, 11 insertions(+), 4 deletions(-)
8
12
9
Permissions for blk_new() were copied from blk_new_open() when
13
diff --git a/meson.build b/meson.build
10
flags = 0.
14
index XXXXXXX..XXXXXXX 100644
15
--- a/meson.build
16
+++ b/meson.build
17
@@ -XXX,XX +XXX,XX @@ blockdev_ss.add(files(
18
# os-win32.c does not
19
blockdev_ss.add(when: 'CONFIG_POSIX', if_true: files('os-posix.c'))
20
softmmu_ss.add(when: 'CONFIG_WIN32', if_true: [files('os-win32.c')])
21
-softmmu_ss.add_all(blockdev_ss)
22
23
common_ss.add(files('cpus-common.c'))
24
25
@@ -XXX,XX +XXX,XX @@ block = declare_dependency(link_whole: [libblock],
26
link_args: '@block.syms',
27
dependencies: [crypto, io])
28
29
+blockdev_ss = blockdev_ss.apply(config_host, strict: false)
30
+libblockdev = static_library('blockdev', blockdev_ss.sources() + genh,
31
+ dependencies: blockdev_ss.dependencies(),
32
+ name_suffix: 'fa',
33
+ build_by_default: false)
34
+
35
+blockdev = declare_dependency(link_whole: [libblockdev],
36
+ dependencies: [block])
37
+
38
qmp_ss = qmp_ss.apply(config_host, strict: false)
39
libqmp = static_library('qmp', qmp_ss.sources() + genh,
40
dependencies: qmp_ss.dependencies(),
41
@@ -XXX,XX +XXX,XX @@ foreach m : block_mods + softmmu_mods
42
install_dir: config_host['qemu_moddir'])
43
endforeach
44
45
-softmmu_ss.add(authz, block, chardev, crypto, io, qmp)
46
+softmmu_ss.add(authz, blockdev, chardev, crypto, io, qmp)
47
common_ss.add(qom, qemuutil)
48
49
common_ss.add_all(when: 'CONFIG_SOFTMMU', if_true: [softmmu_ss])
50
diff --git a/storage-daemon/meson.build b/storage-daemon/meson.build
51
index XXXXXXX..XXXXXXX 100644
52
--- a/storage-daemon/meson.build
53
+++ b/storage-daemon/meson.build
54
@@ -XXX,XX +XXX,XX @@
55
qsd_ss = ss.source_set()
56
qsd_ss.add(files('qemu-storage-daemon.c'))
57
-qsd_ss.add(block, chardev, qmp, qom, qemuutil)
58
-qsd_ss.add_all(blockdev_ss)
59
+qsd_ss.add(blockdev, chardev, qmp, qom, qemuutil)
60
61
subdir('qapi')
62
63
--
64
2.26.2
11
65
12
Reviewed-by: Karl Heubaum <karl.heubaum@oracle.com>
13
Reviewed-by: Eyal Moscovici <eyal.moscovici@oracle.com>
14
Signed-off-by: Sagi Amit <sagi.amit@oracle.com>
15
Co-developed-by: Sagi Amit <sagi.amit@oracle.com>
16
Signed-off-by: Sam Eiderman <shmuel.eiderman@oracle.com>
17
Message-id: 20190523163337.4497-2-shmuel.eiderman@oracle.com
18
Signed-off-by: Max Reitz <mreitz@redhat.com>
19
---
20
qemu-img.c | 29 +++++++++--------------------
21
1 file changed, 9 insertions(+), 20 deletions(-)
22
23
diff --git a/qemu-img.c b/qemu-img.c
24
index XXXXXXX..XXXXXXX 100644
25
--- a/qemu-img.c
26
+++ b/qemu-img.c
27
@@ -XXX,XX +XXX,XX @@ static int img_rebase(int argc, char **argv)
28
29
/* For safe rebasing we need to compare old and new backing file */
30
if (!unsafe) {
31
- char backing_name[PATH_MAX];
32
QDict *options = NULL;
33
+ BlockDriverState *base_bs = backing_bs(bs);
34
35
- if (bs->backing) {
36
- if (bs->backing_format[0] != '\0') {
37
- options = qdict_new();
38
- qdict_put_str(options, "driver", bs->backing_format);
39
- }
40
-
41
- if (force_share) {
42
- if (!options) {
43
- options = qdict_new();
44
- }
45
- qdict_put_bool(options, BDRV_OPT_FORCE_SHARE, true);
46
- }
47
- bdrv_get_backing_filename(bs, backing_name, sizeof(backing_name));
48
- blk_old_backing = blk_new_open(backing_name, NULL,
49
- options, src_flags, &local_err);
50
- if (!blk_old_backing) {
51
+ if (base_bs) {
52
+ blk_old_backing = blk_new(BLK_PERM_CONSISTENT_READ,
53
+ BLK_PERM_ALL);
54
+ ret = blk_insert_bs(blk_old_backing, base_bs,
55
+ &local_err);
56
+ if (ret < 0) {
57
error_reportf_err(local_err,
58
- "Could not open old backing file '%s': ",
59
- backing_name);
60
- ret = -1;
61
+ "Could not reuse old backing file '%s': ",
62
+ base_bs->filename);
63
goto out;
64
}
65
} else {
66
--
67
2.21.0
68
69
1
From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
1
Block exports are used by softmmu, qemu-storage-daemon, and qemu-nbd.
2
They are not used by other programs and are not otherwise needed in
3
libblock.
2
4
3
qcow2.h depends on block_int.h. Compilation isn't broken currently only
5
Undo the recent move of blockdev-nbd.c from blockdev_ss into block_ss.
4
due to block_int.h always being included before qcow2.h. Still, it seems
6
Since bdrv_close_all() (libblock) calls blk_exp_close_all()
5
better to directly include block_int.h in qcow2.h.
7
(libblockdev), a stub function is required.
6
8
7
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
9
Make qemu-nbd.c use signal handling utility functions instead of
8
Reviewed-by: Alberto Garcia <berto@igalia.com>
10
duplicating the code. This helps because os-posix.c is in libblockdev
9
Reviewed-by: Max Reitz <mreitz@redhat.com>
11
and it depends on a qemu_system_killed() symbol that qemu-nbd.c lacks.
10
Message-id: 20190506142741.41731-2-vsementsov@virtuozzo.com
12
Once we use the signal handling utility functions we also end up
11
Signed-off-by: Max Reitz <mreitz@redhat.com>
13
providing the necessary symbol.
14
15
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
16
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
17
Reviewed-by: Eric Blake <eblake@redhat.com>
18
Message-id: 20200929125516.186715-4-stefanha@redhat.com
19
[Fixed s/ndb/nbd/ typo in commit description as suggested by Eric Blake
20
--Stefan]
21
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
12
---
22
---
13
block/qcow2.h | 1 +
23
qemu-nbd.c | 21 ++++++++-------------
14
block/qcow2-bitmap.c | 1 -
24
stubs/blk-exp-close-all.c | 7 +++++++
15
block/qcow2-cache.c | 1 -
25
block/export/meson.build | 4 ++--
16
block/qcow2-cluster.c | 1 -
26
meson.build | 4 ++--
17
block/qcow2-refcount.c | 1 -
27
nbd/meson.build | 2 ++
18
block/qcow2-snapshot.c | 1 -
28
stubs/meson.build | 1 +
19
block/qcow2.c | 1 -
29
6 files changed, 22 insertions(+), 17 deletions(-)
20
7 files changed, 1 insertion(+), 6 deletions(-)
30
create mode 100644 stubs/blk-exp-close-all.c
21
31
22
diff --git a/block/qcow2.h b/block/qcow2.h
32
diff --git a/qemu-nbd.c b/qemu-nbd.c
23
index XXXXXXX..XXXXXXX 100644
33
index XXXXXXX..XXXXXXX 100644
24
--- a/block/qcow2.h
34
--- a/qemu-nbd.c
25
+++ b/block/qcow2.h
35
+++ b/qemu-nbd.c
26
@@ -XXX,XX +XXX,XX @@
27
#include "crypto/block.h"
28
#include "qemu/coroutine.h"
29
#include "qemu/units.h"
30
+#include "block/block_int.h"
31
32
//#define DEBUG_ALLOC
33
//#define DEBUG_ALLOC2
34
diff --git a/block/qcow2-bitmap.c b/block/qcow2-bitmap.c
35
index XXXXXXX..XXXXXXX 100644
36
--- a/block/qcow2-bitmap.c
37
+++ b/block/qcow2-bitmap.c
38
@@ -XXX,XX +XXX,XX @@
36
@@ -XXX,XX +XXX,XX @@
39
#include "qapi/error.h"
37
#include "qapi/error.h"
40
#include "qemu/cutils.h"
38
#include "qemu/cutils.h"
41
39
#include "sysemu/block-backend.h"
42
-#include "block/block_int.h"
40
+#include "sysemu/runstate.h" /* for qemu_system_killed() prototype */
43
#include "qcow2.h"
41
#include "block/block_int.h"
44
42
#include "block/nbd.h"
45
/* NOTICE: BME here means Bitmaps Extension and used as a namespace for
43
#include "qemu/main-loop.h"
46
diff --git a/block/qcow2-cache.c b/block/qcow2-cache.c
44
@@ -XXX,XX +XXX,XX @@ QEMU_COPYRIGHT "\n"
45
}
46
47
#ifdef CONFIG_POSIX
48
-static void termsig_handler(int signum)
49
+/*
50
+ * The client thread uses SIGTERM to interrupt the server. A signal
51
+ * handler ensures that "qemu-nbd -v -c" exits with a nice status code.
52
+ */
53
+void qemu_system_killed(int signum, pid_t pid)
54
{
55
qatomic_cmpxchg(&state, RUNNING, TERMINATE);
56
qemu_notify_event();
57
@@ -XXX,XX +XXX,XX @@ int main(int argc, char **argv)
58
BlockExportOptions *export_opts;
59
60
#ifdef CONFIG_POSIX
61
- /*
62
- * Exit gracefully on various signals, which includes SIGTERM used
63
- * by 'qemu-nbd -v -c'.
64
- */
65
- struct sigaction sa_sigterm;
66
- memset(&sa_sigterm, 0, sizeof(sa_sigterm));
67
- sa_sigterm.sa_handler = termsig_handler;
68
- sigaction(SIGTERM, &sa_sigterm, NULL);
69
- sigaction(SIGINT, &sa_sigterm, NULL);
70
- sigaction(SIGHUP, &sa_sigterm, NULL);
71
-
72
- signal(SIGPIPE, SIG_IGN);
73
+ os_setup_early_signal_handling();
74
+ os_setup_signal_handling();
75
#endif
76
77
socket_init();
78
diff --git a/stubs/blk-exp-close-all.c b/stubs/blk-exp-close-all.c
79
new file mode 100644
80
index XXXXXXX..XXXXXXX
81
--- /dev/null
82
+++ b/stubs/blk-exp-close-all.c
83
@@ -XXX,XX +XXX,XX @@
84
+#include "qemu/osdep.h"
85
+#include "block/export.h"
86
+
87
+/* Only used in programs that support block exports (libblockdev.fa) */
88
+void blk_exp_close_all(void)
89
+{
90
+}
91
diff --git a/block/export/meson.build b/block/export/meson.build
47
index XXXXXXX..XXXXXXX 100644
92
index XXXXXXX..XXXXXXX 100644
48
--- a/block/qcow2-cache.c
93
--- a/block/export/meson.build
49
+++ b/block/qcow2-cache.c
94
+++ b/block/export/meson.build
50
@@ -XXX,XX +XXX,XX @@
95
@@ -XXX,XX +XXX,XX @@
51
*/
96
-block_ss.add(files('export.c'))
52
97
-block_ss.add(when: ['CONFIG_LINUX', 'CONFIG_VHOST_USER'], if_true: files('vhost-user-blk-server.c'))
53
#include "qemu/osdep.h"
98
+blockdev_ss.add(files('export.c'))
54
-#include "block/block_int.h"
99
+blockdev_ss.add(when: ['CONFIG_LINUX', 'CONFIG_VHOST_USER'], if_true: files('vhost-user-blk-server.c'))
55
#include "qemu-common.h"
100
diff --git a/meson.build b/meson.build
56
#include "qcow2.h"
57
#include "trace.h"
58
diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
59
index XXXXXXX..XXXXXXX 100644
101
index XXXXXXX..XXXXXXX 100644
60
--- a/block/qcow2-cluster.c
102
--- a/meson.build
61
+++ b/block/qcow2-cluster.c
103
+++ b/meson.build
104
@@ -XXX,XX +XXX,XX @@ subdir('dump')
105
106
block_ss.add(files(
107
'block.c',
108
- 'blockdev-nbd.c',
109
'blockjob.c',
110
'job.c',
111
'qemu-io-cmds.c',
112
@@ -XXX,XX +XXX,XX @@ subdir('block')
113
114
blockdev_ss.add(files(
115
'blockdev.c',
116
+ 'blockdev-nbd.c',
117
'iothread.c',
118
'job-qmp.c',
119
))
120
@@ -XXX,XX +XXX,XX @@ if have_tools
121
qemu_io = executable('qemu-io', files('qemu-io.c'),
122
dependencies: [block, qemuutil], install: true)
123
qemu_nbd = executable('qemu-nbd', files('qemu-nbd.c'),
124
- dependencies: [block, qemuutil], install: true)
125
+ dependencies: [blockdev, qemuutil], install: true)
126
127
subdir('storage-daemon')
128
subdir('contrib/rdmacm-mux')
129
diff --git a/nbd/meson.build b/nbd/meson.build
130
index XXXXXXX..XXXXXXX 100644
131
--- a/nbd/meson.build
132
+++ b/nbd/meson.build
62
@@ -XXX,XX +XXX,XX @@
133
@@ -XXX,XX +XXX,XX @@
63
134
block_ss.add(files(
64
#include "qapi/error.h"
135
'client.c',
65
#include "qemu-common.h"
136
'common.c',
66
-#include "block/block_int.h"
137
+))
67
#include "qcow2.h"
138
+blockdev_ss.add(files(
68
#include "qemu/bswap.h"
139
'server.c',
69
#include "trace.h"
140
))
70
diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
141
diff --git a/stubs/meson.build b/stubs/meson.build
71
index XXXXXXX..XXXXXXX 100644
142
index XXXXXXX..XXXXXXX 100644
72
--- a/block/qcow2-refcount.c
143
--- a/stubs/meson.build
73
+++ b/block/qcow2-refcount.c
144
+++ b/stubs/meson.build
74
@@ -XXX,XX +XXX,XX @@
145
@@ -XXX,XX +XXX,XX @@
75
#include "qemu/osdep.h"
146
stub_ss.add(files('arch_type.c'))
76
#include "qapi/error.h"
147
stub_ss.add(files('bdrv-next-monitor-owned.c'))
77
#include "qemu-common.h"
148
stub_ss.add(files('blk-commit-all.c'))
78
-#include "block/block_int.h"
149
+stub_ss.add(files('blk-exp-close-all.c'))
79
#include "qcow2.h"
150
stub_ss.add(files('blockdev-close-all-bdrv-states.c'))
80
#include "qemu/range.h"
151
stub_ss.add(files('change-state-handler.c'))
81
#include "qemu/bswap.h"
152
stub_ss.add(files('cmos.c'))
82
diff --git a/block/qcow2-snapshot.c b/block/qcow2-snapshot.c
83
index XXXXXXX..XXXXXXX 100644
84
--- a/block/qcow2-snapshot.c
85
+++ b/block/qcow2-snapshot.c
86
@@ -XXX,XX +XXX,XX @@
87
88
#include "qemu/osdep.h"
89
#include "qapi/error.h"
90
-#include "block/block_int.h"
91
#include "qcow2.h"
92
#include "qemu/bswap.h"
93
#include "qemu/error-report.h"
94
diff --git a/block/qcow2.c b/block/qcow2.c
95
index XXXXXXX..XXXXXXX 100644
96
--- a/block/qcow2.c
97
+++ b/block/qcow2.c
98
@@ -XXX,XX +XXX,XX @@
99
#define ZLIB_CONST
100
#include <zlib.h>
101
102
-#include "block/block_int.h"
103
#include "block/qdict.h"
104
#include "sysemu/block-backend.h"
105
#include "qemu/module.h"
106
--
153
--
107
2.21.0
154
2.26.2
108
155
109
1
From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
1
Make it possible to specify the iothread where the export will run. By
2
default the block node can be moved to other AioContexts later and the
3
export will follow. The fixed-iothread option forces strict behavior
4
that prevents changing AioContext while the export is active. See the
5
QAPI docs for details.
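
As a usage sketch (this assumes the block-export-add QMP command, an NBD server
that is already listening, and an iothread object created as iothread0; all
names here are made up):

    { "execute": "block-export-add",
      "arguments": { "type": "nbd", "id": "export0", "node-name": "disk0",
                     "iothread": "iothread0", "fixed-iothread": true } }

With fixed-iothread=true the command fails if disk0 cannot be moved to
iothread0, and later attempts to change the node's AioContext are refused.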
2
6
3
Split out cluster_size calculation. Move copy-bitmap creation above
7
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
4
block-job creation, as we are going to share it with the upcoming
8
Message-id: 20200929125516.186715-5-stefanha@redhat.com
5
backup-top filter, which should also be created before the actual block job
9
[Fix stray '#' character in block-export.json and add missing "(since:
6
creation.
10
5.2)" as suggested by Eric Blake.
11
--Stefan]
12
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
13
---
14
qapi/block-export.json | 11 ++++++++++
15
block/export/export.c | 31 +++++++++++++++++++++++++++-
16
block/export/vhost-user-blk-server.c | 5 ++++-
17
nbd/server.c | 2 --
18
4 files changed, 45 insertions(+), 4 deletions(-)
7
19
8
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
20
diff --git a/qapi/block-export.json b/qapi/block-export.json
9
Message-id: 20190429090842.57910-6-vsementsov@virtuozzo.com
10
[mreitz: Dropped a paragraph from the commit message that was left over
11
from a previous version]
12
Signed-off-by: Max Reitz <mreitz@redhat.com>
13
---
14
block/backup.c | 82 ++++++++++++++++++++++++++++++++------------------
15
1 file changed, 52 insertions(+), 30 deletions(-)
16
17
diff --git a/block/backup.c b/block/backup.c
18
index XXXXXXX..XXXXXXX 100644
21
index XXXXXXX..XXXXXXX 100644
19
--- a/block/backup.c
22
--- a/qapi/block-export.json
20
+++ b/block/backup.c
23
+++ b/qapi/block-export.json
21
@@ -XXX,XX +XXX,XX @@ static const BlockJobDriver backup_job_driver = {
24
@@ -XXX,XX +XXX,XX @@
22
.drain = backup_drain,
25
# export before completion is signalled. (since: 5.2;
23
};
26
# default: false)
24
27
#
25
+static int64_t backup_calculate_cluster_size(BlockDriverState *target,
28
+# @iothread: The name of the iothread object where the export will run. The
26
+ Error **errp)
29
+# default is to use the thread currently associated with the
27
+{
30
+# block node. (since: 5.2)
28
+ int ret;
31
+#
29
+ BlockDriverInfo bdi;
32
+# @fixed-iothread: True prevents the block node from being moved to another
33
+# thread while the export is active. If true and @iothread is
34
+# given, export creation fails if the block node cannot be
35
+# moved to the iothread. The default is false. (since: 5.2)
36
+#
37
# Since: 4.2
38
##
39
{ 'union': 'BlockExportOptions',
40
'base': { 'type': 'BlockExportType',
41
'id': 'str',
42
+     '*fixed-iothread': 'bool',
43
+     '*iothread': 'str',
44
'node-name': 'str',
45
'*writable': 'bool',
46
'*writethrough': 'bool' },
47
diff --git a/block/export/export.c b/block/export/export.c
48
index XXXXXXX..XXXXXXX 100644
49
--- a/block/export/export.c
50
+++ b/block/export/export.c
51
@@ -XXX,XX +XXX,XX @@
52
53
#include "block/block.h"
54
#include "sysemu/block-backend.h"
55
+#include "sysemu/iothread.h"
56
#include "block/export.h"
57
#include "block/nbd.h"
58
#include "qapi/error.h"
59
@@ -XXX,XX +XXX,XX @@ static const BlockExportDriver *blk_exp_find_driver(BlockExportType type)
60
61
BlockExport *blk_exp_add(BlockExportOptions *export, Error **errp)
62
{
63
+ bool fixed_iothread = export->has_fixed_iothread && export->fixed_iothread;
64
const BlockExportDriver *drv;
65
BlockExport *exp = NULL;
66
BlockDriverState *bs;
67
- BlockBackend *blk;
68
+ BlockBackend *blk = NULL;
69
AioContext *ctx;
70
uint64_t perm;
71
int ret;
72
@@ -XXX,XX +XXX,XX @@ BlockExport *blk_exp_add(BlockExportOptions *export, Error **errp)
73
ctx = bdrv_get_aio_context(bs);
74
aio_context_acquire(ctx);
75
76
+ if (export->has_iothread) {
77
+ IOThread *iothread;
78
+ AioContext *new_ctx;
30
+
79
+
31
+ /*
80
+ iothread = iothread_by_id(export->iothread);
32
+ * If there is no backing file on the target, we cannot rely on COW if our
81
+ if (!iothread) {
33
+ * backup cluster size is smaller than the target cluster size. Even for
82
+ error_setg(errp, "iothread \"%s\" not found", export->iothread);
34
+ * targets with a backing file, try to avoid COW if possible.
83
+ goto fail;
35
+ */
84
+ }
36
+ ret = bdrv_get_info(target, &bdi);
85
+
37
+ if (ret == -ENOTSUP && !target->backing) {
86
+ new_ctx = iothread_get_aio_context(iothread);
38
+ /* Cluster size is not defined */
87
+
39
+ warn_report("The target block device doesn't provide "
88
+ ret = bdrv_try_set_aio_context(bs, new_ctx, errp);
40
+ "information about the block size and it doesn't have a "
89
+ if (ret == 0) {
41
+ "backing file. The default block size of %u bytes is "
90
+ aio_context_release(ctx);
42
+ "used. If the actual block size of the target exceeds "
91
+ aio_context_acquire(new_ctx);
43
+ "this default, the backup may be unusable",
92
+ ctx = new_ctx;
44
+ BACKUP_CLUSTER_SIZE_DEFAULT);
93
+ } else if (fixed_iothread) {
45
+ return BACKUP_CLUSTER_SIZE_DEFAULT;
94
+ goto fail;
46
+ } else if (ret < 0 && !target->backing) {
95
+ }
47
+ error_setg_errno(errp, -ret,
48
+ "Couldn't determine the cluster size of the target image, "
49
+ "which has no backing file");
50
+ error_append_hint(errp,
51
+ "Aborting, since this may create an unusable destination image\n");
52
+ return ret;
53
+ } else if (ret < 0 && target->backing) {
54
+ /* Not fatal; just trudge on ahead. */
55
+ return BACKUP_CLUSTER_SIZE_DEFAULT;
56
+ }
96
+ }
57
+
97
+
58
+ return MAX(BACKUP_CLUSTER_SIZE_DEFAULT, bdi.cluster_size);
98
/*
59
+}
99
* Block exports are used for non-shared storage migration. Make sure
100
* that BDRV_O_INACTIVE is cleared and the image is ready for write
101
@@ -XXX,XX +XXX,XX @@ BlockExport *blk_exp_add(BlockExportOptions *export, Error **errp)
102
}
103
104
blk = blk_new(ctx, perm, BLK_PERM_ALL);
60
+
105
+
61
BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
106
+ if (!fixed_iothread) {
62
BlockDriverState *target, int64_t speed,
107
+ blk_set_allow_aio_context_change(blk, true);
63
MirrorSyncMode sync_mode, BdrvDirtyBitmap *sync_bitmap,
64
@@ -XXX,XX +XXX,XX @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
65
JobTxn *txn, Error **errp)
66
{
67
int64_t len;
68
- BlockDriverInfo bdi;
69
BackupBlockJob *job = NULL;
70
int ret;
71
+ int64_t cluster_size;
72
+ HBitmap *copy_bitmap = NULL;
73
74
assert(bs);
75
assert(target);
76
@@ -XXX,XX +XXX,XX @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
77
goto error;
78
}
79
80
+ cluster_size = backup_calculate_cluster_size(target, errp);
81
+ if (cluster_size < 0) {
82
+ goto error;
83
+ }
108
+ }
84
+
109
+
85
+ copy_bitmap = hbitmap_alloc(len, ctz32(cluster_size));
110
ret = blk_insert_bs(blk, bs, errp);
111
if (ret < 0) {
112
goto fail;
113
diff --git a/block/export/vhost-user-blk-server.c b/block/export/vhost-user-blk-server.c
114
index XXXXXXX..XXXXXXX 100644
115
--- a/block/export/vhost-user-blk-server.c
116
+++ b/block/export/vhost-user-blk-server.c
117
@@ -XXX,XX +XXX,XX @@ static const VuDevIface vu_blk_iface = {
118
static void blk_aio_attached(AioContext *ctx, void *opaque)
119
{
120
VuBlkExport *vexp = opaque;
86
+
121
+
87
/* job->len is fixed, so we can't allow resize */
122
+ vexp->export.ctx = ctx;
88
job = block_job_create(job_id, &backup_job_driver, txn, bs,
123
vhost_user_server_attach_aio_context(&vexp->vu_server, ctx);
89
BLK_PERM_CONSISTENT_READ,
124
}
90
@@ -XXX,XX +XXX,XX @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
125
91
126
static void blk_aio_detach(void *opaque)
92
/* Detect image-fleecing (and similar) schemes */
127
{
93
job->serialize_target_writes = bdrv_chain_contains(target, bs);
128
VuBlkExport *vexp = opaque;
129
+
130
vhost_user_server_detach_aio_context(&vexp->vu_server);
131
+ vexp->export.ctx = NULL;
132
}
133
134
static void
135
@@ -XXX,XX +XXX,XX @@ static int vu_blk_exp_create(BlockExport *exp, BlockExportOptions *opts,
136
vu_blk_initialize_config(blk_bs(exp->blk), &vexp->blkcfg,
137
logical_block_size);
138
139
- blk_set_allow_aio_context_change(exp->blk, true);
140
blk_add_aio_context_notifier(exp->blk, blk_aio_attached, blk_aio_detach,
141
vexp);
142
143
diff --git a/nbd/server.c b/nbd/server.c
144
index XXXXXXX..XXXXXXX 100644
145
--- a/nbd/server.c
146
+++ b/nbd/server.c
147
@@ -XXX,XX +XXX,XX @@ static int nbd_export_create(BlockExport *blk_exp, BlockExportOptions *exp_args,
148
return ret;
149
}
150
151
- blk_set_allow_aio_context_change(blk, true);
94
-
152
-
95
- /* If there is no backing file on the target, we cannot rely on COW if our
153
QTAILQ_INIT(&exp->clients);
96
- * backup cluster size is smaller than the target cluster size. Even for
154
exp->name = g_strdup(arg->name);
97
- * targets with a backing file, try to avoid COW if possible. */
155
exp->description = g_strdup(arg->description);
98
- ret = bdrv_get_info(target, &bdi);
99
- if (ret == -ENOTSUP && !target->backing) {
100
- /* Cluster size is not defined */
101
- warn_report("The target block device doesn't provide "
102
- "information about the block size and it doesn't have a "
103
- "backing file. The default block size of %u bytes is "
104
- "used. If the actual block size of the target exceeds "
105
- "this default, the backup may be unusable",
106
- BACKUP_CLUSTER_SIZE_DEFAULT);
107
- job->cluster_size = BACKUP_CLUSTER_SIZE_DEFAULT;
108
- } else if (ret < 0 && !target->backing) {
109
- error_setg_errno(errp, -ret,
110
- "Couldn't determine the cluster size of the target image, "
111
- "which has no backing file");
112
- error_append_hint(errp,
113
- "Aborting, since this may create an unusable destination image\n");
114
- goto error;
115
- } else if (ret < 0 && target->backing) {
116
- /* Not fatal; just trudge on ahead. */
117
- job->cluster_size = BACKUP_CLUSTER_SIZE_DEFAULT;
118
- } else {
119
- job->cluster_size = MAX(BACKUP_CLUSTER_SIZE_DEFAULT, bdi.cluster_size);
120
- }
121
-
122
- job->copy_bitmap = hbitmap_alloc(len, ctz32(job->cluster_size));
123
+ job->cluster_size = cluster_size;
124
+ job->copy_bitmap = copy_bitmap;
125
+ copy_bitmap = NULL;
126
job->use_copy_range = true;
127
job->copy_range_size = MIN_NON_ZERO(blk_get_max_transfer(job->common.blk),
128
blk_get_max_transfer(job->target));
129
@@ -XXX,XX +XXX,XX @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
130
return &job->common;
131
132
error:
133
+ if (copy_bitmap) {
134
+ assert(!job || !job->copy_bitmap);
135
+ hbitmap_free(copy_bitmap);
136
+ }
137
if (sync_bitmap) {
138
bdrv_reclaim_dirty_bitmap(bs, sync_bitmap, NULL);
139
}
140
--
156
--
141
2.21.0
157
2.26.2
142
158
143
1
From: John Snow <jsnow@redhat.com>
1
Allow the number of queues to be configured using --export
2
vhost-user-blk,num-queues=N. This setting should match the QEMU --device
3
vhost-user-blk-pci,num-queues=N setting, but QEMU vhost-user-blk.c lowers
4
its own value if the vhost-user-blk backend offers fewer queues than
5
QEMU.
2
6
3
We mandate that the source node must be a root node, but there's no reason
7
The vhost-user-blk-server.c code is already capable of multi-queue. All
4
I am aware of that it needs to be restricted in this way. In some cases, we need
8
virtqueue processing runs in the same AioContext. No new locking is
5
to make sure that there's a medium present, but in the general case we can
9
needed.
6
allow the backup job itself to do the graph checking.
7
10
8
This patch helps improve the error message when you try to back up from
11
Add the num-queues=N option and set the VIRTIO_BLK_F_MQ feature bit.
9
the same node more than once, which is reflected in the change to test
12
Note that the feature bit only announces the presence of the num_queues
10
056.
13
configuration space field. It does not promise that there is more than 1
14
virtqueue, so we can set it unconditionally.
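
For example (socket path, image file, and IDs below are invented, and the guest
command line is abbreviated), the two sides might be wired up roughly like this:

    # export with 4 request virtqueues
    qemu-storage-daemon \
        --blockdev file,filename=disk.img,node-name=disk0 \
        --export vhost-user-blk,id=exp0,node-name=disk0,addr.type=unix,addr.path=/tmp/vublk.sock,num-queues=4

    # guest side: the device setting should match
    qemu-system-x86_64 ... \
        -chardev socket,id=char0,path=/tmp/vublk.sock \
        -device vhost-user-blk-pci,chardev=char0,num-queues=4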
11
15
12
For backups with bitmaps, it will also show a better error message that
16
I tested multi-queue by running a random read fio test with numjobs=4 on
13
the bitmap is in use instead of giving you something cryptic like "need
17
an -smp 4 guest. After the benchmark finished, the guest /proc/interrupts
14
a root node."
18
file showed activity on all 4 virtio-blk MSI-X. The /sys/block/vda/mq/
19
directory shows that Linux blk-mq has 4 queues configured.
15
20
16
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1707303
21
An automated test is included in the next commit.
17
Signed-off-by: John Snow <jsnow@redhat.com>
22
18
Message-id: 20190521210053.8864-1-jsnow@redhat.com
23
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
19
Signed-off-by: Max Reitz <mreitz@redhat.com>
24
Acked-by: Markus Armbruster <armbru@redhat.com>
25
Message-id: 20201001144604.559733-2-stefanha@redhat.com
26
[Fixed accidental tab characters as suggested by Markus Armbruster
27
--Stefan]
28
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
20
---
29
---
21
blockdev.c | 7 ++++++-
30
qapi/block-export.json | 10 +++++++---
22
tests/qemu-iotests/056 | 2 +-
31
block/export/vhost-user-blk-server.c | 24 ++++++++++++++++++------
23
2 files changed, 7 insertions(+), 2 deletions(-)
32
2 files changed, 25 insertions(+), 9 deletions(-)
24
33
25
diff --git a/blockdev.c b/blockdev.c
34
diff --git a/qapi/block-export.json b/qapi/block-export.json
26
index XXXXXXX..XXXXXXX 100644
35
index XXXXXXX..XXXXXXX 100644
27
--- a/blockdev.c
36
--- a/qapi/block-export.json
28
+++ b/blockdev.c
37
+++ b/qapi/block-export.json
29
@@ -XXX,XX +XXX,XX @@ static BlockJob *do_drive_backup(DriveBackup *backup, JobTxn *txn,
38
@@ -XXX,XX +XXX,XX @@
30
backup->compress = false;
39
# SocketAddress types are supported. Passed fds must be UNIX domain
40
# sockets.
41
# @logical-block-size: Logical block size in bytes. Defaults to 512 bytes.
42
+# @num-queues: Number of request virtqueues. Must be greater than 0. Defaults
43
+# to 1.
44
#
45
# Since: 5.2
46
##
47
{ 'struct': 'BlockExportOptionsVhostUserBlk',
48
- 'data': { 'addr': 'SocketAddress', '*logical-block-size': 'size' } }
49
+ 'data': { 'addr': 'SocketAddress',
50
+     '*logical-block-size': 'size',
51
+ '*num-queues': 'uint16'} }
52
53
##
54
# @NbdServerAddOptions:
55
@@ -XXX,XX +XXX,XX @@
56
{ 'union': 'BlockExportOptions',
57
'base': { 'type': 'BlockExportType',
58
'id': 'str',
59
-     '*fixed-iothread': 'bool',
60
-     '*iothread': 'str',
61
+ '*fixed-iothread': 'bool',
62
+ '*iothread': 'str',
63
'node-name': 'str',
64
'*writable': 'bool',
65
'*writethrough': 'bool' },
66
diff --git a/block/export/vhost-user-blk-server.c b/block/export/vhost-user-blk-server.c
67
index XXXXXXX..XXXXXXX 100644
68
--- a/block/export/vhost-user-blk-server.c
69
+++ b/block/export/vhost-user-blk-server.c
70
@@ -XXX,XX +XXX,XX @@
71
#include "util/block-helpers.h"
72
73
enum {
74
- VHOST_USER_BLK_MAX_QUEUES = 1,
75
+ VHOST_USER_BLK_NUM_QUEUES_DEFAULT = 1,
76
};
77
struct virtio_blk_inhdr {
78
unsigned char status;
79
@@ -XXX,XX +XXX,XX @@ static uint64_t vu_blk_get_features(VuDev *dev)
80
1ull << VIRTIO_BLK_F_DISCARD |
81
1ull << VIRTIO_BLK_F_WRITE_ZEROES |
82
1ull << VIRTIO_BLK_F_CONFIG_WCE |
83
+ 1ull << VIRTIO_BLK_F_MQ |
84
1ull << VIRTIO_F_VERSION_1 |
85
1ull << VIRTIO_RING_F_INDIRECT_DESC |
86
1ull << VIRTIO_RING_F_EVENT_IDX |
87
@@ -XXX,XX +XXX,XX @@ static void blk_aio_detach(void *opaque)
88
89
static void
90
vu_blk_initialize_config(BlockDriverState *bs,
91
- struct virtio_blk_config *config, uint32_t blk_size)
92
+ struct virtio_blk_config *config,
93
+ uint32_t blk_size,
94
+ uint16_t num_queues)
95
{
96
config->capacity = bdrv_getlength(bs) >> BDRV_SECTOR_BITS;
97
config->blk_size = blk_size;
98
@@ -XXX,XX +XXX,XX @@ vu_blk_initialize_config(BlockDriverState *bs,
99
config->seg_max = 128 - 2;
100
config->min_io_size = 1;
101
config->opt_io_size = 1;
102
- config->num_queues = VHOST_USER_BLK_MAX_QUEUES;
103
+ config->num_queues = num_queues;
104
config->max_discard_sectors = 32768;
105
config->max_discard_seg = 1;
106
config->discard_sector_alignment = config->blk_size >> 9;
107
@@ -XXX,XX +XXX,XX @@ static int vu_blk_exp_create(BlockExport *exp, BlockExportOptions *opts,
108
BlockExportOptionsVhostUserBlk *vu_opts = &opts->u.vhost_user_blk;
109
Error *local_err = NULL;
110
uint64_t logical_block_size;
111
+ uint16_t num_queues = VHOST_USER_BLK_NUM_QUEUES_DEFAULT;
112
113
vexp->writable = opts->writable;
114
vexp->blkcfg.wce = 0;
115
@@ -XXX,XX +XXX,XX @@ static int vu_blk_exp_create(BlockExport *exp, BlockExportOptions *opts,
31
}
116
}
32
117
vexp->blk_size = logical_block_size;
33
- bs = qmp_get_root_bs(backup->device, errp);
118
blk_set_guest_block_size(exp->blk, logical_block_size);
34
+ bs = bdrv_lookup_bs(backup->device, backup->device, errp);
119
+
35
if (!bs) {
120
+ if (vu_opts->has_num_queues) {
36
return NULL;
121
+ num_queues = vu_opts->num_queues;
37
}
122
+ }
38
123
+ if (num_queues == 0) {
39
+ if (!bs->drv) {
124
+ error_setg(errp, "num-queues must be greater than 0");
40
+ error_setg(errp, "Device has no medium");
125
+ return -EINVAL;
41
+ return NULL;
42
+ }
126
+ }
43
+
127
+
44
aio_context = bdrv_get_aio_context(bs);
128
vu_blk_initialize_config(blk_bs(exp->blk), &vexp->blkcfg,
45
aio_context_acquire(aio_context);
129
- logical_block_size);
46
130
+ logical_block_size, num_queues);
47
diff --git a/tests/qemu-iotests/056 b/tests/qemu-iotests/056
131
48
index XXXXXXX..XXXXXXX 100755
132
blk_add_aio_context_notifier(exp->blk, blk_aio_attached, blk_aio_detach,
49
--- a/tests/qemu-iotests/056
133
vexp);
50
+++ b/tests/qemu-iotests/056
134
51
@@ -XXX,XX +XXX,XX @@ class BackupTest(iotests.QMPTestCase):
135
if (!vhost_user_server_start(&vexp->vu_server, vu_opts->addr, exp->ctx,
52
res = self.vm.qmp('query-block-jobs')
136
- VHOST_USER_BLK_MAX_QUEUES, &vu_blk_iface,
53
self.assert_qmp(res, 'return[0]/status', 'concluded')
137
- errp)) {
54
# Leave zombie job un-dismissed, observe a failure:
138
+ num_queues, &vu_blk_iface, errp)) {
55
- res = self.qmp_backup_and_wait(serror='Need a root block node',
139
blk_remove_aio_context_notifier(exp->blk, blk_aio_attached,
56
+ res = self.qmp_backup_and_wait(serror="Node 'drive0' is busy: block device is in use by block job: backup",
140
blk_aio_detach, vexp);
57
device='drive0', format=iotests.imgfmt,
141
return -EADDRNOTAVAIL;
58
sync='full', target=self.dest_img,
59
auto_dismiss=False)
60
--
142
--
61
2.21.0
143
2.26.2
62
144
63
1
From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
1
From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2
2
3
Encryption will be done in threads; to benefit from this, we should
3
bdrv_co_block_status_above has several design problems with handling
4
move it out of the lock first.
4
short backing files:
5
6
1. With want_zeros=true, it may return ret with BDRV_BLOCK_ZERO but
7
without the BDRV_BLOCK_ALLOCATED flag, when the short backing file
8
that produces these after-EOF zeros is actually inside the requested backing
9
sequence.
10
11
2. With want_zero=false, it may return pnum=0 prior to actual EOF,
12
because of the EOF of a short backing file.
13
14
Fix these things, making the logic about short backing files clearer.
15
16
With fixed bdrv_block_status_above we also have to improve is_zero in
17
qcow2 code, otherwise iotest 154 will fail, because with this patch we
18
stop merging zeros of different types (those produced by regions unallocated
19
in the whole backing chain vs. those produced by short backing files).
20
21
Note also that this patch leaves for another day the general problem
22
around block-status: misuse of BDRV_BLOCK_ALLOCATED as is-fs-allocated
23
vs go-to-backing.
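
To make the "short backing file" case concrete (image names and sizes are made
up), such a chain can be produced like this:

    qemu-img create -f qcow2 base.qcow2 1G
    qemu-img create -f qcow2 -b base.qcow2 -F qcow2 top.qcow2 2G
    # Reads of top.qcow2 between 1G and 2G fall beyond the end of base.qcow2;
    # the zeros synthesized there are what block status must now attribute to
    # the correct layer of the backing chain.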
5
24
6
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
25
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
7
Reviewed-by: Alberto Garcia <berto@igalia.com>
26
Reviewed-by: Alberto Garcia <berto@igalia.com>
8
Reviewed-by: Max Reitz <mreitz@redhat.com>
27
Reviewed-by: Eric Blake <eblake@redhat.com>
9
Message-id: 20190506142741.41731-8-vsementsov@virtuozzo.com
28
Message-id: 20200924194003.22080-2-vsementsov@virtuozzo.com
10
Signed-off-by: Max Reitz <mreitz@redhat.com>
29
[Fix s/comes/come/ as suggested by Eric Blake
30
--Stefan]
31
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
11
---
32
---
12
block/qcow2.c | 35 +++++++++++++++++++++--------------
33
block/io.c | 68 ++++++++++++++++++++++++++++++++++++++++-----------
13
1 file changed, 21 insertions(+), 14 deletions(-)
34
block/qcow2.c | 16 ++++++++++--
35
2 files changed, 68 insertions(+), 16 deletions(-)
14
36
37
diff --git a/block/io.c b/block/io.c
38
index XXXXXXX..XXXXXXX 100644
39
--- a/block/io.c
40
+++ b/block/io.c
41
@@ -XXX,XX +XXX,XX @@ bdrv_co_common_block_status_above(BlockDriverState *bs,
42
int64_t *map,
43
BlockDriverState **file)
44
{
45
+ int ret;
46
BlockDriverState *p;
47
- int ret = 0;
48
- bool first = true;
49
+ int64_t eof = 0;
50
51
assert(bs != base);
52
- for (p = bs; p != base; p = bdrv_filter_or_cow_bs(p)) {
53
+
54
+ ret = bdrv_co_block_status(bs, want_zero, offset, bytes, pnum, map, file);
55
+ if (ret < 0 || *pnum == 0 || ret & BDRV_BLOCK_ALLOCATED) {
56
+ return ret;
57
+ }
58
+
59
+ if (ret & BDRV_BLOCK_EOF) {
60
+ eof = offset + *pnum;
61
+ }
62
+
63
+ assert(*pnum <= bytes);
64
+ bytes = *pnum;
65
+
66
+ for (p = bdrv_filter_or_cow_bs(bs); p != base;
67
+ p = bdrv_filter_or_cow_bs(p))
68
+ {
69
ret = bdrv_co_block_status(p, want_zero, offset, bytes, pnum, map,
70
file);
71
if (ret < 0) {
72
- break;
73
+ return ret;
74
}
75
- if (ret & BDRV_BLOCK_ZERO && ret & BDRV_BLOCK_EOF && !first) {
76
+ if (*pnum == 0) {
77
/*
78
- * Reading beyond the end of the file continues to read
79
- * zeroes, but we can only widen the result to the
80
- * unallocated length we learned from an earlier
81
- * iteration.
82
+ * The top layer deferred to this layer, and because this layer is
83
+ * short, any zeroes that we synthesize beyond EOF behave as if they
84
+ * were allocated at this layer.
85
+ *
86
+ * We don't include BDRV_BLOCK_EOF into ret, as upper layer may be
87
+ * larger. We'll add BDRV_BLOCK_EOF if needed at function end, see
88
+ * below.
89
*/
90
+ assert(ret & BDRV_BLOCK_EOF);
91
*pnum = bytes;
92
+ if (file) {
93
+ *file = p;
94
+ }
95
+ ret = BDRV_BLOCK_ZERO | BDRV_BLOCK_ALLOCATED;
96
+ break;
97
}
98
- if (ret & (BDRV_BLOCK_ZERO | BDRV_BLOCK_DATA)) {
99
+ if (ret & BDRV_BLOCK_ALLOCATED) {
100
+ /*
101
+ * We've found the node and the status, we must break.
102
+ *
103
+ * Drop BDRV_BLOCK_EOF, as it's not for upper layer, which may be
104
+ * larger. We'll add BDRV_BLOCK_EOF if needed at function end, see
105
+ * below.
106
+ */
107
+ ret &= ~BDRV_BLOCK_EOF;
108
break;
109
}
110
- /* [offset, pnum] unallocated on this layer, which could be only
111
- * the first part of [offset, bytes]. */
112
- bytes = MIN(bytes, *pnum);
113
- first = false;
114
+
115
+ /*
116
+ * OK, [offset, offset + *pnum) region is unallocated on this layer,
117
+ * let's continue the diving.
118
+ */
119
+ assert(*pnum <= bytes);
120
+ bytes = *pnum;
121
+ }
122
+
123
+ if (offset + *pnum == eof) {
124
+ ret |= BDRV_BLOCK_EOF;
125
}
126
+
127
return ret;
128
}
129
15
diff --git a/block/qcow2.c b/block/qcow2.c
130
diff --git a/block/qcow2.c b/block/qcow2.c
16
index XXXXXXX..XXXXXXX 100644
131
index XXXXXXX..XXXXXXX 100644
17
--- a/block/qcow2.c
132
--- a/block/qcow2.c
18
+++ b/block/qcow2.c
133
+++ b/block/qcow2.c
19
@@ -XXX,XX +XXX,XX @@ static coroutine_fn int qcow2_co_pwritev(BlockDriverState *bs, uint64_t offset,
134
@@ -XXX,XX +XXX,XX @@ static bool is_zero(BlockDriverState *bs, int64_t offset, int64_t bytes)
20
ret = qcow2_alloc_cluster_offset(bs, offset, &cur_bytes,
135
if (!bytes) {
21
&cluster_offset, &l2meta);
136
return true;
22
if (ret < 0) {
137
}
23
- goto fail;
138
- res = bdrv_block_status_above(bs, NULL, offset, bytes, &nr, NULL, NULL);
24
+ goto out_locked;
139
- return res >= 0 && (res & BDRV_BLOCK_ZERO) && nr == bytes;
25
}
26
27
assert((cluster_offset & 511) == 0);
28
29
+ ret = qcow2_pre_write_overlap_check(bs, 0,
30
+ cluster_offset + offset_in_cluster,
31
+ cur_bytes, true);
32
+ if (ret < 0) {
33
+ goto out_locked;
34
+ }
35
+
140
+
36
+ qemu_co_mutex_unlock(&s->lock);
141
+ /*
142
+ * bdrv_block_status_above doesn't merge different types of zeros, for
143
+ * example, zeros which come from the region which is unallocated in
144
+ * the whole backing chain, and zeros which come because of a short
145
+ * backing file. So, we need a loop.
146
+ */
147
+ do {
148
+ res = bdrv_block_status_above(bs, NULL, offset, bytes, &nr, NULL, NULL);
149
+ offset += nr;
150
+ bytes -= nr;
151
+ } while (res >= 0 && (res & BDRV_BLOCK_ZERO) && nr && bytes);
37
+
152
+
38
qemu_iovec_reset(&hd_qiov);
153
+ return res >= 0 && (res & BDRV_BLOCK_ZERO) && bytes == 0;
39
qemu_iovec_concat(&hd_qiov, qiov, bytes_done, cur_bytes);
154
}
40
155
41
@@ -XXX,XX +XXX,XX @@ static coroutine_fn int qcow2_co_pwritev(BlockDriverState *bs, uint64_t offset,
156
static coroutine_fn int qcow2_co_pwrite_zeroes(BlockDriverState *bs,
42
* s->cluster_size);
43
if (cluster_data == NULL) {
44
ret = -ENOMEM;
45
- goto fail;
46
+ goto out_unlocked;
47
}
48
}
49
50
@@ -XXX,XX +XXX,XX @@ static coroutine_fn int qcow2_co_pwritev(BlockDriverState *bs, uint64_t offset,
51
cluster_data,
52
cur_bytes, NULL) < 0) {
53
ret = -EIO;
54
- goto fail;
55
+ goto out_unlocked;
56
}
57
58
qemu_iovec_reset(&hd_qiov);
59
qemu_iovec_add(&hd_qiov, cluster_data, cur_bytes);
60
}
61
62
- ret = qcow2_pre_write_overlap_check(bs, 0,
63
- cluster_offset + offset_in_cluster, cur_bytes, true);
64
- if (ret < 0) {
65
- goto fail;
66
- }
67
-
68
/* If we need to do COW, check if it's possible to merge the
69
* writing of the guest data together with that of the COW regions.
70
* If it's not possible (or not necessary) then write the
71
* guest data now. */
72
if (!merge_cow(offset, cur_bytes, &hd_qiov, l2meta)) {
73
- qemu_co_mutex_unlock(&s->lock);
74
BLKDBG_EVENT(bs->file, BLKDBG_WRITE_AIO);
75
trace_qcow2_writev_data(qemu_coroutine_self(),
76
cluster_offset + offset_in_cluster);
77
ret = bdrv_co_pwritev(s->data_file,
78
cluster_offset + offset_in_cluster,
79
cur_bytes, &hd_qiov, 0);
80
- qemu_co_mutex_lock(&s->lock);
81
if (ret < 0) {
82
- goto fail;
83
+ goto out_unlocked;
84
}
85
}
86
87
+ qemu_co_mutex_lock(&s->lock);
88
+
89
ret = qcow2_handle_l2meta(bs, &l2meta, true);
90
if (ret) {
91
- goto fail;
92
+ goto out_locked;
93
}
94
95
bytes -= cur_bytes;
96
@@ -XXX,XX +XXX,XX @@ static coroutine_fn int qcow2_co_pwritev(BlockDriverState *bs, uint64_t offset,
97
trace_qcow2_writev_done_part(qemu_coroutine_self(), cur_bytes);
98
}
99
ret = 0;
100
+ goto out_locked;
101
102
-fail:
103
+out_unlocked:
104
+ qemu_co_mutex_lock(&s->lock);
105
+
106
+out_locked:
107
qcow2_handle_l2meta(bs, &l2meta, false);
108
109
qemu_co_mutex_unlock(&s->lock);
110
--
157
--
111
2.21.0
158
2.26.2
112
159
113
diff view generated by jsdifflib
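
The loop added to is_zero() above works around the fact that a block_status-style
query reports at most one run with a single status per call, so two adjacent zero
runs of different origin (say, "unallocated in the whole chain" next to "zeros read
past a short backing file") come back as separate answers. The following standalone
sketch shows the same keep-asking-until-exhausted pattern; it is not QEMU code, and
the toy block_status() plus the two-run disk layout are invented purely for
illustration:

/* Sketch: why a single status query is not enough for is_zero(). */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

struct run { int64_t len; bool zero; };

/* Pretend the first MiB is unallocated zeros and the second MiB is zeros
 * coming from a short backing file: same data, two separate status runs. */
static const struct run disk[] = { { 1 << 20, true }, { 1 << 20, true } };

static int64_t block_status(int64_t offset, int64_t bytes, bool *zero)
{
    int64_t pos = 0;
    for (size_t i = 0; i < sizeof(disk) / sizeof(disk[0]); i++) {
        if (offset < pos + disk[i].len) {
            int64_t avail = pos + disk[i].len - offset;
            *zero = disk[i].zero;
            return avail < bytes ? avail : bytes;   /* stops at run boundary */
        }
        pos += disk[i].len;
    }
    *zero = false;
    return 0;
}

static bool is_zero(int64_t offset, int64_t bytes)
{
    bool zero;
    int64_t nr;

    do {                        /* keep asking until the range is exhausted */
        nr = block_status(offset, bytes, &zero);
        offset += nr;
        bytes -= nr;
    } while (zero && nr && bytes);

    return zero && bytes == 0;
}

int main(void)
{
    printf("whole 2 MiB zero? %s\n", is_zero(0, 2 << 20) ? "yes" : "no");
    return 0;
}

A single query over the full 2 MiB would stop at the first run boundary, and the
caller would wrongly conclude that the rest of the range might contain data.
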
1
From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
1
From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2
2
3
We are going to share this bitmap between backup and backup-top filter
3
In order to reuse bdrv_common_block_status_above in
4
driver, so let's share something more meaningful. It also simplifies
4
bdrv_is_allocated_above, let's support an include_base parameter.
5
some calculations.
6
5
7
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
6
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
8
Reviewed-by: Max Reitz <mreitz@redhat.com>
7
Reviewed-by: Alberto Garcia <berto@igalia.com>
9
Message-id: 20190429090842.57910-3-vsementsov@virtuozzo.com
8
Reviewed-by: Eric Blake <eblake@redhat.com>
10
Signed-off-by: Max Reitz <mreitz@redhat.com>
9
Message-id: 20200924194003.22080-3-vsementsov@virtuozzo.com
10
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
11
---
11
---
12
block/backup.c | 48 +++++++++++++++++++++++-------------------------
12
block/coroutines.h | 2 ++
13
1 file changed, 23 insertions(+), 25 deletions(-)
13
block/io.c | 21 ++++++++++++++-------
14
2 files changed, 16 insertions(+), 7 deletions(-)
14
15
15
diff --git a/block/backup.c b/block/backup.c
16
diff --git a/block/coroutines.h b/block/coroutines.h
16
index XXXXXXX..XXXXXXX 100644
17
index XXXXXXX..XXXXXXX 100644
17
--- a/block/backup.c
18
--- a/block/coroutines.h
18
+++ b/block/backup.c
19
+++ b/block/coroutines.h
19
@@ -XXX,XX +XXX,XX @@ static int coroutine_fn backup_cow_with_bounce_buffer(BackupBlockJob *job,
20
@@ -XXX,XX +XXX,XX @@ bdrv_pwritev(BdrvChild *child, int64_t offset, unsigned int bytes,
20
int read_flags = is_write_notifier ? BDRV_REQ_NO_SERIALISING : 0;
21
int coroutine_fn
21
int write_flags = job->serialize_target_writes ? BDRV_REQ_SERIALISING : 0;
22
bdrv_co_common_block_status_above(BlockDriverState *bs,
22
23
BlockDriverState *base,
23
- hbitmap_reset(job->copy_bitmap, start / job->cluster_size, 1);
24
+ bool include_base,
24
+ assert(QEMU_IS_ALIGNED(start, job->cluster_size));
25
bool want_zero,
25
+ hbitmap_reset(job->copy_bitmap, start, job->cluster_size);
26
int64_t offset,
26
nbytes = MIN(job->cluster_size, job->len - start);
27
int64_t bytes,
27
if (!*bounce_buffer) {
28
@@ -XXX,XX +XXX,XX @@ bdrv_co_common_block_status_above(BlockDriverState *bs,
28
*bounce_buffer = blk_blockalign(blk, job->cluster_size);
29
int generated_co_wrapper
29
@@ -XXX,XX +XXX,XX @@ static int coroutine_fn backup_cow_with_bounce_buffer(BackupBlockJob *job,
30
bdrv_common_block_status_above(BlockDriverState *bs,
30
31
BlockDriverState *base,
31
return nbytes;
32
+ bool include_base,
32
fail:
33
bool want_zero,
33
- hbitmap_set(job->copy_bitmap, start / job->cluster_size, 1);
34
int64_t offset,
34
+ hbitmap_set(job->copy_bitmap, start, job->cluster_size);
35
int64_t bytes,
35
return ret;
36
diff --git a/block/io.c b/block/io.c
36
37
index XXXXXXX..XXXXXXX 100644
37
}
38
--- a/block/io.c
38
@@ -XXX,XX +XXX,XX @@ static int coroutine_fn backup_cow_with_offload(BackupBlockJob *job,
39
+++ b/block/io.c
39
int write_flags = job->serialize_target_writes ? BDRV_REQ_SERIALISING : 0;
40
@@ -XXX,XX +XXX,XX @@ early_out:
40
41
int coroutine_fn
41
assert(QEMU_IS_ALIGNED(job->copy_range_size, job->cluster_size));
42
bdrv_co_common_block_status_above(BlockDriverState *bs,
42
+ assert(QEMU_IS_ALIGNED(start, job->cluster_size));
43
BlockDriverState *base,
43
nbytes = MIN(job->copy_range_size, end - start);
44
+ bool include_base,
44
nr_clusters = DIV_ROUND_UP(nbytes, job->cluster_size);
45
bool want_zero,
45
- hbitmap_reset(job->copy_bitmap, start / job->cluster_size,
46
int64_t offset,
46
- nr_clusters);
47
int64_t bytes,
47
+ hbitmap_reset(job->copy_bitmap, start, job->cluster_size * nr_clusters);
48
@@ -XXX,XX +XXX,XX @@ bdrv_co_common_block_status_above(BlockDriverState *bs,
48
ret = blk_co_copy_range(blk, start, job->target, start, nbytes,
49
BlockDriverState *p;
49
read_flags, write_flags);
50
int64_t eof = 0;
50
if (ret < 0) {
51
51
trace_backup_do_cow_copy_range_fail(job, start, ret);
52
- assert(bs != base);
52
- hbitmap_set(job->copy_bitmap, start / job->cluster_size,
53
+ assert(include_base || bs != base);
53
- nr_clusters);
54
+ assert(!include_base || base); /* Can't include NULL base */
54
+ hbitmap_set(job->copy_bitmap, start, job->cluster_size * nr_clusters);
55
56
ret = bdrv_co_block_status(bs, want_zero, offset, bytes, pnum, map, file);
57
- if (ret < 0 || *pnum == 0 || ret & BDRV_BLOCK_ALLOCATED) {
58
+ if (ret < 0 || *pnum == 0 || ret & BDRV_BLOCK_ALLOCATED || bs == base) {
55
return ret;
59
return ret;
56
}
60
}
57
61
58
@@ -XXX,XX +XXX,XX @@ static int coroutine_fn backup_do_cow(BackupBlockJob *job,
62
@@ -XXX,XX +XXX,XX @@ bdrv_co_common_block_status_above(BlockDriverState *bs,
59
cow_request_begin(&cow_request, job, start, end);
63
assert(*pnum <= bytes);
60
64
bytes = *pnum;
61
while (start < end) {
65
62
- if (!hbitmap_get(job->copy_bitmap, start / job->cluster_size)) {
66
- for (p = bdrv_filter_or_cow_bs(bs); p != base;
63
+ if (!hbitmap_get(job->copy_bitmap, start)) {
67
+ for (p = bdrv_filter_or_cow_bs(bs); include_base || p != base;
64
trace_backup_do_cow_skip(job, start);
68
p = bdrv_filter_or_cow_bs(p))
65
start += job->cluster_size;
66
continue; /* already copied */
67
@@ -XXX,XX +XXX,XX @@ static void backup_clean(Job *job)
68
assert(s->target);
69
blk_unref(s->target);
70
s->target = NULL;
71
+
72
+ if (s->copy_bitmap) {
73
+ hbitmap_free(s->copy_bitmap);
74
+ s->copy_bitmap = NULL;
75
+ }
76
}
77
78
void backup_do_checkpoint(BlockJob *job, Error **errp)
79
{
80
BackupBlockJob *backup_job = container_of(job, BackupBlockJob, common);
81
- int64_t len;
82
83
assert(block_job_driver(job) == &backup_job_driver);
84
85
@@ -XXX,XX +XXX,XX @@ void backup_do_checkpoint(BlockJob *job, Error **errp)
86
return;
87
}
88
89
- len = DIV_ROUND_UP(backup_job->len, backup_job->cluster_size);
90
- hbitmap_set(backup_job->copy_bitmap, 0, len);
91
+ hbitmap_set(backup_job->copy_bitmap, 0, backup_job->len);
92
}
93
94
static void backup_drain(BlockJob *job)
95
@@ -XXX,XX +XXX,XX @@ static int coroutine_fn backup_run_incremental(BackupBlockJob *job)
96
{
97
int ret;
98
bool error_is_read;
99
- int64_t cluster;
100
+ int64_t offset;
101
HBitmapIter hbi;
102
103
hbitmap_iter_init(&hbi, job->copy_bitmap, 0);
104
- while ((cluster = hbitmap_iter_next(&hbi)) != -1) {
105
+ while ((offset = hbitmap_iter_next(&hbi)) != -1) {
106
do {
107
if (yield_and_check(job)) {
108
return 0;
109
}
110
- ret = backup_do_cow(job, cluster * job->cluster_size,
111
+ ret = backup_do_cow(job, offset,
112
job->cluster_size, &error_is_read, false);
113
if (ret < 0 && backup_error_action(job, error_is_read, -ret) ==
114
BLOCK_ERROR_ACTION_REPORT)
115
@@ -XXX,XX +XXX,XX @@ static void backup_incremental_init_copy_bitmap(BackupBlockJob *job)
116
while (bdrv_dirty_bitmap_next_dirty_area(job->sync_bitmap,
117
&offset, &bytes))
118
{
69
{
119
- uint64_t cluster = offset / job->cluster_size;
70
ret = bdrv_co_block_status(p, want_zero, offset, bytes, pnum, map,
120
- uint64_t end_cluster = DIV_ROUND_UP(offset + bytes, job->cluster_size);
71
@@ -XXX,XX +XXX,XX @@ bdrv_co_common_block_status_above(BlockDriverState *bs,
121
+ hbitmap_set(job->copy_bitmap, offset, bytes);
122
123
- hbitmap_set(job->copy_bitmap, cluster, end_cluster - cluster);
124
-
125
- offset = end_cluster * job->cluster_size;
126
+ offset += bytes;
127
if (offset >= job->len) {
128
break;
72
break;
129
}
73
}
130
@@ -XXX,XX +XXX,XX @@ static void backup_incremental_init_copy_bitmap(BackupBlockJob *job)
74
131
75
+ if (p == base) {
132
/* TODO job_progress_set_remaining() would make more sense */
76
+ assert(include_base);
133
job_progress_update(&job->common.job,
77
+ break;
134
- job->len - hbitmap_count(job->copy_bitmap) * job->cluster_size);
78
+ }
135
+ job->len - hbitmap_count(job->copy_bitmap));
79
+
80
/*
81
* OK, [offset, offset + *pnum) region is unallocated on this layer,
82
* let's continue the diving.
83
@@ -XXX,XX +XXX,XX @@ int bdrv_block_status_above(BlockDriverState *bs, BlockDriverState *base,
84
int64_t offset, int64_t bytes, int64_t *pnum,
85
int64_t *map, BlockDriverState **file)
86
{
87
- return bdrv_common_block_status_above(bs, base, true, offset, bytes,
88
+ return bdrv_common_block_status_above(bs, base, false, true, offset, bytes,
89
pnum, map, file);
136
}
90
}
137
91
138
static int coroutine_fn backup_run(Job *job, Error **errp)
92
@@ -XXX,XX +XXX,XX @@ int coroutine_fn bdrv_is_allocated(BlockDriverState *bs, int64_t offset,
139
{
93
int ret;
140
BackupBlockJob *s = container_of(job, BackupBlockJob, common.job);
94
int64_t dummy;
141
BlockDriverState *bs = blk_bs(s->common.blk);
95
142
- int64_t offset, nb_clusters;
96
- ret = bdrv_common_block_status_above(bs, bdrv_filter_or_cow_bs(bs), false,
143
+ int64_t offset;
97
- offset, bytes, pnum ? pnum : &dummy,
144
int ret = 0;
98
- NULL, NULL);
145
99
+ ret = bdrv_common_block_status_above(bs, bs, true, false, offset,
146
QLIST_INIT(&s->inflight_reqs);
100
+ bytes, pnum ? pnum : &dummy, NULL,
147
qemu_co_rwlock_init(&s->flush_rwlock);
101
+ NULL);
148
102
if (ret < 0) {
149
- nb_clusters = DIV_ROUND_UP(s->len, s->cluster_size);
103
return ret;
150
job_progress_set_remaining(job, s->len);
151
152
- s->copy_bitmap = hbitmap_alloc(nb_clusters, 0);
153
if (s->sync_mode == MIRROR_SYNC_MODE_INCREMENTAL) {
154
backup_incremental_init_copy_bitmap(s);
155
} else {
156
- hbitmap_set(s->copy_bitmap, 0, nb_clusters);
157
+ hbitmap_set(s->copy_bitmap, 0, s->len);
158
}
104
}
159
160
-
161
s->before_write.notify = backup_before_write_notify;
162
bdrv_add_before_write_notifier(bs, &s->before_write);
163
164
@@ -XXX,XX +XXX,XX @@ static int coroutine_fn backup_run(Job *job, Error **errp)
165
/* wait until pending backup_do_cow() calls have completed */
166
qemu_co_rwlock_wrlock(&s->flush_rwlock);
167
qemu_co_rwlock_unlock(&s->flush_rwlock);
168
- hbitmap_free(s->copy_bitmap);
169
170
return ret;
171
}
172
@@ -XXX,XX +XXX,XX @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
173
} else {
174
job->cluster_size = MAX(BACKUP_CLUSTER_SIZE_DEFAULT, bdi.cluster_size);
175
}
176
+
177
+ job->copy_bitmap = hbitmap_alloc(len, ctz32(job->cluster_size));
178
job->use_copy_range = true;
179
job->copy_range_size = MIN_NON_ZERO(blk_get_max_transfer(job->common.blk),
180
blk_get_max_transfer(job->target));
181
--
105
--
182
2.21.0
106
2.26.2
183
107
184
diff view generated by jsdifflib
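
The include_base parameter added to bdrv_co_common_block_status_above() only
changes how the walk over the backing chain terminates: with include_base=false
the loop stops before base, with include_base=true the base node itself is
queried as the last layer, and bs == base with include_base=false is treated as
an empty interval. Below is a minimal standalone sketch of that convention; it
is not QEMU code, and the node type and three-node chain are invented for
illustration only:

/* Sketch: loop-termination rule for an include_base-style chain walk. */
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

struct node {
    const char *name;
    bool allocated;             /* does this layer allocate the queried range? */
    struct node *backing;       /* chain is assumed to end at the base node */
};

static bool status_above(struct node *bs, struct node *base, bool include_base)
{
    struct node *p;

    assert(!include_base || base);          /* can't include a NULL base */

    if (!include_base && bs == base) {
        return false;                       /* corner case: empty interval */
    }
    if (bs->allocated) {
        return true;
    }

    for (p = bs->backing; include_base || p != base; p = p->backing) {
        if (p->allocated) {
            return true;
        }
        if (p == base) {
            assert(include_base);
            break;                          /* base was the last layer queried */
        }
    }
    return false;
}

int main(void)
{
    struct node base = { "base", true, NULL };
    struct node mid  = { "mid",  false, &base };
    struct node top  = { "top",  false, &mid };

    printf("top..mid  allocated: %d\n", status_above(&top, &base, false)); /* 0 */
    printf("top..base allocated: %d\n", status_above(&top, &base, true));  /* 1 */
    return 0;
}
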
1
From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
1
From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2
2
3
Split allocation checking into a separate function and reduce nesting.
3
We are going to reuse bdrv_common_block_status_above in
4
Consider a bdrv_is_allocated() failure as an allocated area, as copying more
4
bdrv_is_allocated_above. bdrv_is_allocated_above may be called with
5
than needed is not wrong (and we do it anyway) and seems better than
5
include_base == false and still bs == base (e.g. from img_rebase()).
6
failing the whole job. Also, we will most probably fail on the next read
6
7
if there is a real problem with the source.
7
So, support this corner case.
8
8
9
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
9
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
10
Reviewed-by: Max Reitz <mreitz@redhat.com>
10
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
11
Message-id: 20190429090842.57910-4-vsementsov@virtuozzo.com
11
Reviewed-by: Eric Blake <eblake@redhat.com>
12
Signed-off-by: Max Reitz <mreitz@redhat.com>
12
Reviewed-by: Alberto Garcia <berto@igalia.com>
13
Message-id: 20200924194003.22080-4-vsementsov@virtuozzo.com
14
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
13
---
15
---
14
block/backup.c | 60 +++++++++++++++++++-------------------------------
16
block/io.c | 6 +++++-
15
1 file changed, 23 insertions(+), 37 deletions(-)
17
1 file changed, 5 insertions(+), 1 deletion(-)
16
18
17
diff --git a/block/backup.c b/block/backup.c
19
diff --git a/block/io.c b/block/io.c
18
index XXXXXXX..XXXXXXX 100644
20
index XXXXXXX..XXXXXXX 100644
19
--- a/block/backup.c
21
--- a/block/io.c
20
+++ b/block/backup.c
22
+++ b/block/io.c
21
@@ -XXX,XX +XXX,XX @@ static bool coroutine_fn yield_and_check(BackupBlockJob *job)
23
@@ -XXX,XX +XXX,XX @@ bdrv_co_common_block_status_above(BlockDriverState *bs,
22
return false;
24
BlockDriverState *p;
23
}
25
int64_t eof = 0;
24
26
25
+static bool bdrv_is_unallocated_range(BlockDriverState *bs,
27
- assert(include_base || bs != base);
26
+ int64_t offset, int64_t bytes)
28
assert(!include_base || base); /* Can't include NULL base */
27
+{
29
28
+ int64_t end = offset + bytes;
30
+ if (!include_base && bs == base) {
29
+
31
+ *pnum = bytes;
30
+ while (offset < end && !bdrv_is_allocated(bs, offset, bytes, &bytes)) {
32
+ return 0;
31
+ if (bytes == 0) {
32
+ return true;
33
+ }
34
+ offset += bytes;
35
+ bytes = end - offset;
36
+ }
33
+ }
37
+
34
+
38
+ return offset >= end;
35
ret = bdrv_co_block_status(bs, want_zero, offset, bytes, pnum, map, file);
39
+}
36
if (ret < 0 || *pnum == 0 || ret & BDRV_BLOCK_ALLOCATED || bs == base) {
40
+
37
return ret;
41
static int coroutine_fn backup_run_incremental(BackupBlockJob *job)
42
{
43
int ret;
44
@@ -XXX,XX +XXX,XX @@ static int coroutine_fn backup_run(Job *job, Error **errp)
45
for (offset = 0; offset < s->len;
46
offset += s->cluster_size) {
47
bool error_is_read;
48
- int alloced = 0;
49
50
if (yield_and_check(s)) {
51
break;
52
}
53
54
- if (s->sync_mode == MIRROR_SYNC_MODE_TOP) {
55
- int i;
56
- int64_t n;
57
-
58
- /* Check to see if these blocks are already in the
59
- * backing file. */
60
-
61
- for (i = 0; i < s->cluster_size;) {
62
- /* bdrv_is_allocated() only returns true/false based
63
- * on the first set of sectors it comes across that
64
- * are are all in the same state.
65
- * For that reason we must verify each sector in the
66
- * backup cluster length. We end up copying more than
67
- * needed but at some point that is always the case. */
68
- alloced =
69
- bdrv_is_allocated(bs, offset + i,
70
- s->cluster_size - i, &n);
71
- i += n;
72
-
73
- if (alloced || n == 0) {
74
- break;
75
- }
76
- }
77
-
78
- /* If the above loop never found any sectors that are in
79
- * the topmost image, skip this backup. */
80
- if (alloced == 0) {
81
- continue;
82
- }
83
- }
84
- /* FULL sync mode we copy the whole drive. */
85
- if (alloced < 0) {
86
- ret = alloced;
87
- } else {
88
- ret = backup_do_cow(s, offset, s->cluster_size,
89
- &error_is_read, false);
90
+ if (s->sync_mode == MIRROR_SYNC_MODE_TOP &&
91
+ bdrv_is_unallocated_range(bs, offset, s->cluster_size))
92
+ {
93
+ continue;
94
}
95
+
96
+ ret = backup_do_cow(s, offset, s->cluster_size,
97
+ &error_is_read, false);
98
if (ret < 0) {
99
/* Depending on error action, fail now or retry cluster */
100
BlockErrorAction action =
101
--
38
--
102
2.21.0
39
2.26.2
103
40
104
diff view generated by jsdifflib
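
The new bdrv_is_unallocated_range() helper in the backup patch repeatedly asks
an is_allocated-style query, which only answers for a prefix of the requested
range, until either an allocated prefix is found or the whole range has been
covered. Here is a standalone sketch of the same loop shape; it is not QEMU
code, and the toy image (only the fifth 64 KiB cluster allocated) and the
is_allocated() stand-in are invented for illustration:

/* Sketch: checking that a whole byte range is unallocated. */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define CLUSTER (64 * 1024)

/* Stand-in for bdrv_is_allocated(): answers for a prefix of the range and
 * reports the prefix length through *pnum. */
static bool is_allocated(int64_t offset, int64_t bytes, int64_t *pnum)
{
    bool alloc = offset >= 4 * CLUSTER;
    int64_t run_end = alloc ? 5 * CLUSTER : 4 * CLUSTER;
    int64_t n = run_end - offset;

    *pnum = n < bytes ? n : bytes;
    return alloc;
}

/* True only if every byte of [offset, offset + bytes) is unallocated. */
static bool is_unallocated_range(int64_t offset, int64_t bytes)
{
    int64_t end = offset + bytes;

    while (offset < end && !is_allocated(offset, bytes, &bytes)) {
        if (bytes == 0) {
            return true;        /* EOF reached: treat the tail as a hole */
        }
        offset += bytes;
        bytes = end - offset;
    }

    return offset >= end;
}

int main(void)
{
    printf("first 4 clusters unallocated: %d\n",
           is_unallocated_range(0, 4 * CLUSTER));          /* 1 */
    printf("first 5 clusters unallocated: %d\n",
           is_unallocated_range(0, 5 * CLUSTER));          /* 0 */
    return 0;
}
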
1
From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
1
From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2
2
3
Simplify backup_incremental_init_copy_bitmap using the function
3
bdrv_is_allocated_above wrongly handles short backing files: it reports
4
bdrv_dirty_bitmap_next_dirty_area.
4
after-EOF space as UNALLOCATED which is wrong, as on read the data is
5
generated at the level of the short backing file (if all overlays have
6
unallocated areas at that place).
5
7
6
Note: move to job->len instead of bitmap size: it should not matter, but
8
Reusing bdrv_common_block_status_above fixes the issue and unifies the code
7
it means less code.
9
path.
8
10
9
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
11
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
10
Reviewed-by: Max Reitz <mreitz@redhat.com>
12
Reviewed-by: Eric Blake <eblake@redhat.com>
11
Message-id: 20190429090842.57910-2-vsementsov@virtuozzo.com
13
Reviewed-by: Alberto Garcia <berto@igalia.com>
12
Signed-off-by: Max Reitz <mreitz@redhat.com>
14
Message-id: 20200924194003.22080-5-vsementsov@virtuozzo.com
15
[Fix s/has/have/ as suggested by Eric Blake. Fix s/area/areas/.
16
--Stefan]
17
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
13
---
18
---
14
block/backup.c | 40 ++++++++++++----------------------------
19
block/io.c | 43 +++++--------------------------------------
15
1 file changed, 12 insertions(+), 28 deletions(-)
20
1 file changed, 5 insertions(+), 38 deletions(-)
16
21
17
diff --git a/block/backup.c b/block/backup.c
22
diff --git a/block/io.c b/block/io.c
18
index XXXXXXX..XXXXXXX 100644
23
index XXXXXXX..XXXXXXX 100644
19
--- a/block/backup.c
24
--- a/block/io.c
20
+++ b/block/backup.c
25
+++ b/block/io.c
21
@@ -XXX,XX +XXX,XX @@ static int coroutine_fn backup_run_incremental(BackupBlockJob *job)
26
@@ -XXX,XX +XXX,XX @@ int coroutine_fn bdrv_is_allocated(BlockDriverState *bs, int64_t offset,
22
/* init copy_bitmap from sync_bitmap */
27
* at 'offset + *pnum' may return the same allocation status (in other
23
static void backup_incremental_init_copy_bitmap(BackupBlockJob *job)
28
* words, the result is not necessarily the maximum possible range);
29
* but 'pnum' will only be 0 when end of file is reached.
30
- *
31
*/
32
int bdrv_is_allocated_above(BlockDriverState *top,
33
BlockDriverState *base,
34
bool include_base, int64_t offset,
35
int64_t bytes, int64_t *pnum)
24
{
36
{
25
- BdrvDirtyBitmapIter *dbi;
37
- BlockDriverState *intermediate;
26
- int64_t offset;
38
- int ret;
27
- int64_t end = DIV_ROUND_UP(bdrv_dirty_bitmap_size(job->sync_bitmap),
39
- int64_t n = bytes;
28
- job->cluster_size);
29
-
40
-
30
- dbi = bdrv_dirty_iter_new(job->sync_bitmap);
41
- assert(base || !include_base);
31
- while ((offset = bdrv_dirty_iter_next(dbi)) != -1) {
32
- int64_t cluster = offset / job->cluster_size;
33
- int64_t next_cluster;
34
-
42
-
35
- offset += bdrv_dirty_bitmap_granularity(job->sync_bitmap);
43
- intermediate = top;
36
- if (offset >= bdrv_dirty_bitmap_size(job->sync_bitmap)) {
44
- while (include_base || intermediate != base) {
37
- hbitmap_set(job->copy_bitmap, cluster, end - cluster);
45
- int64_t pnum_inter;
46
- int64_t size_inter;
47
-
48
- assert(intermediate);
49
- ret = bdrv_is_allocated(intermediate, offset, bytes, &pnum_inter);
50
- if (ret < 0) {
51
- return ret;
52
- }
53
- if (ret) {
54
- *pnum = pnum_inter;
55
- return 1;
56
- }
57
-
58
- size_inter = bdrv_getlength(intermediate);
59
- if (size_inter < 0) {
60
- return size_inter;
61
- }
62
- if (n > pnum_inter &&
63
- (intermediate == top || offset + pnum_inter < size_inter)) {
64
- n = pnum_inter;
65
- }
66
-
67
- if (intermediate == base) {
38
- break;
68
- break;
39
- }
69
- }
40
+ uint64_t offset = 0;
41
+ uint64_t bytes = job->len;
42
43
- offset = bdrv_dirty_bitmap_next_zero(job->sync_bitmap, offset,
44
- UINT64_MAX);
45
- if (offset == -1) {
46
- hbitmap_set(job->copy_bitmap, cluster, end - cluster);
47
- break;
48
- }
49
+ while (bdrv_dirty_bitmap_next_dirty_area(job->sync_bitmap,
50
+ &offset, &bytes))
51
+ {
52
+ uint64_t cluster = offset / job->cluster_size;
53
+ uint64_t end_cluster = DIV_ROUND_UP(offset + bytes, job->cluster_size);
54
55
- next_cluster = DIV_ROUND_UP(offset, job->cluster_size);
56
- hbitmap_set(job->copy_bitmap, cluster, next_cluster - cluster);
57
- if (next_cluster >= end) {
58
+ hbitmap_set(job->copy_bitmap, cluster, end_cluster - cluster);
59
+
60
+ offset = end_cluster * job->cluster_size;
61
+ if (offset >= job->len) {
62
break;
63
}
64
-
70
-
65
- bdrv_set_dirty_iter(dbi, next_cluster * job->cluster_size);
71
- intermediate = bdrv_filter_or_cow_bs(intermediate);
66
+ bytes = job->len - offset;
72
+ int ret = bdrv_common_block_status_above(top, base, include_base, false,
73
+ offset, bytes, pnum, NULL, NULL);
74
+ if (ret < 0) {
75
+ return ret;
67
}
76
}
68
77
69
/* TODO job_progress_set_remaining() would make more sense */
78
- *pnum = n;
70
job_progress_update(&job->common.job,
79
- return 0;
71
job->len - hbitmap_count(job->copy_bitmap) * job->cluster_size);
80
+ return !!(ret & BDRV_BLOCK_ALLOCATED);
72
-
73
- bdrv_dirty_iter_free(dbi);
74
}
81
}
75
82
76
static int coroutine_fn backup_run(Job *job, Error **errp)
83
int coroutine_fn
77
--
84
--
78
2.21.0
85
2.26.2
79
86
80
diff view generated by jsdifflib
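
The rewritten backup_incremental_init_copy_bitmap() walks the sync bitmap by
dirty byte ranges and rounds each range out to whole clusters in the copy
bitmap. The standalone sketch below mirrors that walk; it is not QEMU code:
the plain bool array stands in for HBitmap, next_dirty_area() stands in for
bdrv_dirty_bitmap_next_dirty_area(), and the dirty ranges are invented:

/* Sketch: initializing a per-cluster copy bitmap from byte-granular dirty areas. */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define CLUSTER   (64 * 1024)
#define LEN       (8 * CLUSTER)
#define NCLUSTERS (LEN / CLUSTER)

/* Dirty byte ranges recorded by the guest (made up for the example). */
static const struct { uint64_t offset, bytes; } dirty[] = {
    { 10 * 1024, 4 * 1024 },                /* inside cluster 0 */
    { 3 * CLUSTER + 512, CLUSTER },         /* spans clusters 3 and 4 */
};

/* Report the next contiguous dirty byte range at or after *offset. */
static bool next_dirty_area(uint64_t *offset, uint64_t *bytes)
{
    for (size_t i = 0; i < sizeof(dirty) / sizeof(dirty[0]); i++) {
        if (dirty[i].offset + dirty[i].bytes > *offset) {
            *offset = dirty[i].offset > *offset ? dirty[i].offset : *offset;
            *bytes = dirty[i].offset + dirty[i].bytes - *offset;
            return true;
        }
    }
    return false;
}

int main(void)
{
    bool copy_bitmap[NCLUSTERS];
    uint64_t offset = 0, bytes = LEN;

    memset(copy_bitmap, 0, sizeof(copy_bitmap));

    while (next_dirty_area(&offset, &bytes)) {
        uint64_t cluster = offset / CLUSTER;
        uint64_t end_cluster = (offset + bytes + CLUSTER - 1) / CLUSTER;

        for (uint64_t c = cluster; c < end_cluster; c++) {
            copy_bitmap[c] = true;          /* round out to whole clusters */
        }

        offset = end_cluster * CLUSTER;
        if (offset >= LEN) {
            break;
        }
        bytes = LEN - offset;
    }

    for (int c = 0; c < NCLUSTERS; c++) {
        printf("cluster %d: %s\n", c, copy_bitmap[c] ? "copy" : "skip");
    }
    return 0;
}

Only clusters 0, 3 and 4 end up marked for copying, even though the guest
dirtied less than a full cluster in each case.
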
1
From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
1
From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2
2
3
Drop dependence on AioContext lock.
3
These cases are fixed by previous patches around block_status and
4
is_allocated.
4
5
5
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
6
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
7
Reviewed-by: Eric Blake <eblake@redhat.com>
6
Reviewed-by: Alberto Garcia <berto@igalia.com>
8
Reviewed-by: Alberto Garcia <berto@igalia.com>
7
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
9
Message-id: 20200924194003.22080-6-vsementsov@virtuozzo.com
8
Reviewed-by: Max Reitz <mreitz@redhat.com>
10
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
9
Message-id: 20190506142741.41731-5-vsementsov@virtuozzo.com
10
Signed-off-by: Max Reitz <mreitz@redhat.com>
11
---
11
---
12
block/qcow2-threads.c | 10 +++++++---
12
tests/qemu-iotests/274 | 20 +++++++++++
13
1 file changed, 7 insertions(+), 3 deletions(-)
13
tests/qemu-iotests/274.out | 68 ++++++++++++++++++++++++++++++++++++++
14
2 files changed, 88 insertions(+)
14
15
15
diff --git a/block/qcow2-threads.c b/block/qcow2-threads.c
16
diff --git a/tests/qemu-iotests/274 b/tests/qemu-iotests/274
17
index XXXXXXX..XXXXXXX 100755
18
--- a/tests/qemu-iotests/274
19
+++ b/tests/qemu-iotests/274
20
@@ -XXX,XX +XXX,XX @@ with iotests.FilePath('base') as base, \
21
iotests.qemu_io_log('-c', 'read -P 1 0 %d' % size_short, mid)
22
iotests.qemu_io_log('-c', 'read -P 0 %d %d' % (size_short, size_diff), mid)
23
24
+ iotests.log('=== Testing qemu-img commit (top -> base) ===')
25
+
26
+ create_chain()
27
+ iotests.qemu_img_log('commit', '-b', base, top)
28
+ iotests.img_info_log(base)
29
+ iotests.qemu_io_log('-c', 'read -P 1 0 %d' % size_short, base)
30
+ iotests.qemu_io_log('-c', 'read -P 0 %d %d' % (size_short, size_diff), base)
31
+
32
+ iotests.log('=== Testing QMP active commit (top -> base) ===')
33
+
34
+ create_chain()
35
+ with create_vm() as vm:
36
+ vm.launch()
37
+ vm.qmp_log('block-commit', device='top', base_node='base',
38
+ job_id='job0', auto_dismiss=False)
39
+ vm.run_job('job0', wait=5)
40
+
41
+ iotests.img_info_log(mid)
42
+ iotests.qemu_io_log('-c', 'read -P 1 0 %d' % size_short, base)
43
+ iotests.qemu_io_log('-c', 'read -P 0 %d %d' % (size_short, size_diff), base)
44
45
iotests.log('== Resize tests ==')
46
47
diff --git a/tests/qemu-iotests/274.out b/tests/qemu-iotests/274.out
16
index XXXXXXX..XXXXXXX 100644
48
index XXXXXXX..XXXXXXX 100644
17
--- a/block/qcow2-threads.c
49
--- a/tests/qemu-iotests/274.out
18
+++ b/block/qcow2-threads.c
50
+++ b/tests/qemu-iotests/274.out
19
@@ -XXX,XX +XXX,XX @@ qcow2_co_do_compress(BlockDriverState *bs, void *dest, size_t dest_size,
51
@@ -XXX,XX +XXX,XX @@ read 1048576/1048576 bytes at offset 0
20
.func = func,
52
read 1048576/1048576 bytes at offset 1048576
21
};
53
1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
22
54
23
+ qemu_co_mutex_lock(&s->lock);
55
+=== Testing qemu-img commit (top -> base) ===
24
while (s->nb_compress_threads >= MAX_COMPRESS_THREADS) {
56
+Formatting 'TEST_DIR/PID-base', fmt=qcow2 cluster_size=65536 extended_l2=off compression_type=zlib size=2097152 lazy_refcounts=off refcount_bits=16
25
- qemu_co_queue_wait(&s->compress_wait_queue, NULL);
26
+ qemu_co_queue_wait(&s->compress_wait_queue, &s->lock);
27
}
28
-
29
s->nb_compress_threads++;
30
+ qemu_co_mutex_unlock(&s->lock);
31
+
57
+
32
thread_pool_submit_co(pool, qcow2_compress_pool_func, &arg);
58
+Formatting 'TEST_DIR/PID-mid', fmt=qcow2 cluster_size=65536 extended_l2=off compression_type=zlib size=1048576 backing_file=TEST_DIR/PID-base backing_fmt=qcow2 lazy_refcounts=off refcount_bits=16
33
- s->nb_compress_threads--;
59
+
34
60
+Formatting 'TEST_DIR/PID-top', fmt=qcow2 cluster_size=65536 extended_l2=off compression_type=zlib size=2097152 backing_file=TEST_DIR/PID-mid backing_fmt=qcow2 lazy_refcounts=off refcount_bits=16
35
+ qemu_co_mutex_lock(&s->lock);
61
+
36
+ s->nb_compress_threads--;
62
+wrote 2097152/2097152 bytes at offset 0
37
qemu_co_queue_next(&s->compress_wait_queue);
63
+2 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
38
+ qemu_co_mutex_unlock(&s->lock);
64
+
39
65
+Image committed.
40
return arg.ret;
66
+
41
}
67
+image: TEST_IMG
68
+file format: IMGFMT
69
+virtual size: 2 MiB (2097152 bytes)
70
+cluster_size: 65536
71
+Format specific information:
72
+ compat: 1.1
73
+ compression type: zlib
74
+ lazy refcounts: false
75
+ refcount bits: 16
76
+ corrupt: false
77
+ extended l2: false
78
+
79
+read 1048576/1048576 bytes at offset 0
80
+1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
81
+
82
+read 1048576/1048576 bytes at offset 1048576
83
+1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
84
+
85
+=== Testing QMP active commit (top -> base) ===
86
+Formatting 'TEST_DIR/PID-base', fmt=qcow2 cluster_size=65536 extended_l2=off compression_type=zlib size=2097152 lazy_refcounts=off refcount_bits=16
87
+
88
+Formatting 'TEST_DIR/PID-mid', fmt=qcow2 cluster_size=65536 extended_l2=off compression_type=zlib size=1048576 backing_file=TEST_DIR/PID-base backing_fmt=qcow2 lazy_refcounts=off refcount_bits=16
89
+
90
+Formatting 'TEST_DIR/PID-top', fmt=qcow2 cluster_size=65536 extended_l2=off compression_type=zlib size=2097152 backing_file=TEST_DIR/PID-mid backing_fmt=qcow2 lazy_refcounts=off refcount_bits=16
91
+
92
+wrote 2097152/2097152 bytes at offset 0
93
+2 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
94
+
95
+{"execute": "block-commit", "arguments": {"auto-dismiss": false, "base-node": "base", "device": "top", "job-id": "job0"}}
96
+{"return": {}}
97
+{"execute": "job-complete", "arguments": {"id": "job0"}}
98
+{"return": {}}
99
+{"data": {"device": "job0", "len": 1048576, "offset": 1048576, "speed": 0, "type": "commit"}, "event": "BLOCK_JOB_READY", "timestamp": {"microseconds": "USECS", "seconds": "SECS"}}
100
+{"data": {"device": "job0", "len": 1048576, "offset": 1048576, "speed": 0, "type": "commit"}, "event": "BLOCK_JOB_COMPLETED", "timestamp": {"microseconds": "USECS", "seconds": "SECS"}}
101
+{"execute": "job-dismiss", "arguments": {"id": "job0"}}
102
+{"return": {}}
103
+image: TEST_IMG
104
+file format: IMGFMT
105
+virtual size: 1 MiB (1048576 bytes)
106
+cluster_size: 65536
107
+backing file: TEST_DIR/PID-base
108
+backing file format: IMGFMT
109
+Format specific information:
110
+ compat: 1.1
111
+ compression type: zlib
112
+ lazy refcounts: false
113
+ refcount bits: 16
114
+ corrupt: false
115
+ extended l2: false
116
+
117
+read 1048576/1048576 bytes at offset 0
118
+1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
119
+
120
+read 1048576/1048576 bytes at offset 1048576
121
+1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
122
+
123
== Resize tests ==
124
=== preallocation=off ===
125
Formatting 'TEST_DIR/PID-base', fmt=qcow2 cluster_size=65536 extended_l2=off compression_type=zlib size=6442450944 lazy_refcounts=off refcount_bits=16
42
--
126
--
43
2.21.0
127
2.26.2
44
128
45
diff view generated by jsdifflib
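
The qcow2-threads change takes s->lock around the nb_compress_threads counter
and passes the lock to qemu_co_queue_wait(), which drops it while the coroutine
sleeps on the queue and re-acquires it on wakeup, while the compression work
itself runs unlocked. As a rough analogy only, the same slot-limiting pattern
transposed to plain pthreads looks like the standalone sketch below (this is an
illustration, not QEMU code; the job count and sleep are arbitrary):

/* Sketch: limit concurrent jobs to a fixed number of slots, counter under lock. */
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

#define MAX_COMPRESS_THREADS 4
#define NUM_JOBS             16

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  slot_free = PTHREAD_COND_INITIALIZER;
static int nb_compress_threads;

static void *compress_job(void *arg)
{
    (void)arg;

    pthread_mutex_lock(&lock);
    while (nb_compress_threads >= MAX_COMPRESS_THREADS) {
        pthread_cond_wait(&slot_free, &lock);   /* drops lock while waiting */
    }
    nb_compress_threads++;
    pthread_mutex_unlock(&lock);

    usleep(1000);                               /* the "compression" work */

    pthread_mutex_lock(&lock);
    nb_compress_threads--;
    pthread_cond_signal(&slot_free);            /* wake one waiter */
    pthread_mutex_unlock(&lock);
    return NULL;
}

int main(void)
{
    pthread_t threads[NUM_JOBS];

    for (int i = 0; i < NUM_JOBS; i++) {
        pthread_create(&threads[i], NULL, compress_job, NULL);
    }
    for (int i = 0; i < NUM_JOBS; i++) {
        pthread_join(threads[i], NULL);
    }
    printf("all jobs done, in flight now: %d\n", nb_compress_threads);
    return 0;
}
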