1
The following changes since commit 3fbd3405d2b0604ea530fc7a1828f19da1e95ff9:
1
The following changes since commit ac793156f650ae2d77834932d72224175ee69086:
2
2
3
Merge remote-tracking branch 'remotes/huth-gitlab/tags/pull-request-2019-08-17' into staging (2019-08-19 14:14:09 +0100)
3
Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20201020-1' into staging (2020-10-20 21:11:35 +0100)
4
4
5
are available in the Git repository at:
5
are available in the Git repository at:
6
6
7
https://github.com/XanClic/qemu.git tags/pull-block-2019-08-19
7
https://gitlab.com/stefanha/qemu.git tags/block-pull-request
8
8
9
for you to fetch changes up to fa27c478102a6b5d1c6b02c005607ad9404b915f:
9
for you to fetch changes up to 32a3fd65e7e3551337fd26bfc0e2f899d70c028c:
10
10
11
doc: Preallocation does not require writing zeroes (2019-08-19 17:13:26 +0200)
11
iotests: add commit top->base cases to 274 (2020-10-22 09:55:39 +0100)
12
12
13
----------------------------------------------------------------
13
----------------------------------------------------------------
14
Block patches:
14
Pull request
15
- preallocation=falloc/full support for LUKS
15
16
- Various minor fixes
16
v2:
17
* Fix format string issues on 32-bit hosts [Peter]
18
* Fix qemu-nbd.c CONFIG_POSIX ifdef issue [Eric]
19
* Fix missing eventfd.h header on macOS [Peter]
20
* Drop unreliable vhost-user-blk test (will send a new patch when ready) [Peter]
21
22
This pull request contains the vhost-user-blk server by Coiby Xu along with my
23
additions, block/nvme.c alignment and hardware error statistics by Philippe
24
Mathieu-Daudé, and bdrv_co_block_status_above() fixes by Vladimir
25
Sementsov-Ogievskiy.
17
26
18
----------------------------------------------------------------
27
----------------------------------------------------------------
19
Max Reitz (16):
20
qemu-img: Fix bdrv_has_zero_init() use in convert
21
mirror: Fix bdrv_has_zero_init() use
22
block: Add bdrv_has_zero_init_truncate()
23
block: Implement .bdrv_has_zero_init_truncate()
24
block: Use bdrv_has_zero_init_truncate()
25
qcow2: Fix .bdrv_has_zero_init()
26
vdi: Fix .bdrv_has_zero_init()
27
vhdx: Fix .bdrv_has_zero_init()
28
iotests: Convert to preallocated encrypted qcow2
29
iotests: Test convert -n to pre-filled image
30
iotests: Full mirror to existing non-zero image
31
vdi: Make block_status recurse for fixed images
32
vmdk: Make block_status recurse for flat extents
33
vpc: Do not return RAW from block_status
34
iotests: Fix 141 when run with qed
35
doc: Preallocation does not require writing zeroes
36
28
37
Maxim Levitsky (1):
29
Coiby Xu (6):
38
LUKS: support preallocation
30
libvhost-user: Allow vu_message_read to be replaced
31
libvhost-user: remove watch for kick_fd when de-initialize vu-dev
32
util/vhost-user-server: generic vhost user server
33
block: move logical block size check function to a common utility
34
function
35
block/export: vhost-user block device backend server
36
MAINTAINERS: Add vhost-user block device backend server maintainer
39
37
40
qapi/block-core.json | 15 +++++---
38
Philippe Mathieu-Daudé (1):
41
include/block/block.h | 1 +
39
block/nvme: Add driver statistics for access alignment and hw errors
42
include/block/block_int.h | 9 +++++
40
43
block.c | 21 +++++++++++
41
Stefan Hajnoczi (16):
44
block/crypto.c | 30 ++++++++++++++--
42
util/vhost-user-server: s/fileds/fields/ typo fix
45
block/file-posix.c | 1 +
43
util/vhost-user-server: drop unnecessary QOM cast
46
block/file-win32.c | 1 +
44
util/vhost-user-server: drop unnecessary watch deletion
47
block/gluster.c | 4 +++
45
block/export: consolidate request structs into VuBlockReq
48
block/mirror.c | 11 ++++--
46
util/vhost-user-server: drop unused DevicePanicNotifier
49
block/nfs.c | 1 +
47
util/vhost-user-server: fix memory leak in vu_message_read()
50
block/parallels.c | 2 +-
48
util/vhost-user-server: check EOF when reading payload
51
block/qcow2.c | 30 +++++++++++++++-
49
util/vhost-user-server: rework vu_client_trip() coroutine lifecycle
52
block/qed.c | 1 +
50
block/export: report flush errors
53
block/raw-format.c | 6 ++++
51
block/export: convert vhost-user-blk server to block export API
54
block/rbd.c | 1 +
52
util/vhost-user-server: move header to include/
55
block/sheepdog.c | 1 +
53
util/vhost-user-server: use static library in meson.build
56
block/ssh.c | 1 +
54
qemu-storage-daemon: avoid compiling blockdev_ss twice
57
block/vdi.c | 16 +++++++--
55
block: move block exports to libblockdev
58
block/vhdx.c | 28 +++++++++++++--
56
block/export: add iothread and fixed-iothread options
59
block/vmdk.c | 3 ++
57
block/export: add vhost-user-blk multi-queue support
60
block/vpc.c | 2 +-
58
61
blockdev.c | 16 +++++++--
59
Vladimir Sementsov-Ogievskiy (5):
62
qemu-img.c | 11 ++++--
60
block/io: fix bdrv_co_block_status_above
63
tests/test-block-iothread.c | 2 +-
61
block/io: bdrv_common_block_status_above: support include_base
64
docs/qemu-block-drivers.texi | 4 +--
62
block/io: bdrv_common_block_status_above: support bs == base
65
qemu-img.texi | 4 +--
63
block/io: fix bdrv_is_allocated_above
66
tests/qemu-iotests/041 | 62 +++++++++++++++++++++++++++++---
64
iotests: add commit top->base cases to 274
67
tests/qemu-iotests/041.out | 4 +--
65
68
tests/qemu-iotests/122 | 17 +++++++++
66
MAINTAINERS | 9 +
69
tests/qemu-iotests/122.out | 8 +++++
67
qapi/block-core.json | 24 +-
70
tests/qemu-iotests/141 | 9 +++--
68
qapi/block-export.json | 36 +-
71
tests/qemu-iotests/141.out | 5 ---
69
block/coroutines.h | 2 +
72
tests/qemu-iotests/188 | 20 ++++++++++-
70
block/export/vhost-user-blk-server.h | 19 +
73
tests/qemu-iotests/188.out | 4 +++
71
contrib/libvhost-user/libvhost-user.h | 21 +
74
tests/qemu-iotests/common.filter | 5 +++
72
include/qemu/vhost-user-server.h | 65 +++
75
35 files changed, 313 insertions(+), 43 deletions(-)
73
util/block-helpers.h | 19 +
74
block/export/export.c | 37 +-
75
block/export/vhost-user-blk-server.c | 431 ++++++++++++++++++++
76
block/io.c | 132 +++---
77
block/nvme.c | 27 ++
78
block/qcow2.c | 16 +-
79
contrib/libvhost-user/libvhost-user-glib.c | 2 +-
80
contrib/libvhost-user/libvhost-user.c | 15 +-
81
hw/core/qdev-properties-system.c | 31 +-
82
nbd/server.c | 2 -
83
qemu-nbd.c | 21 +-
84
softmmu/vl.c | 4 +
85
stubs/blk-exp-close-all.c | 7 +
86
tests/vhost-user-bridge.c | 2 +
87
tools/virtiofsd/fuse_virtio.c | 4 +-
88
util/block-helpers.c | 46 +++
89
util/vhost-user-server.c | 446 +++++++++++++++++++++
90
block/export/meson.build | 3 +-
91
contrib/libvhost-user/meson.build | 1 +
92
meson.build | 22 +-
93
nbd/meson.build | 2 +
94
storage-daemon/meson.build | 3 +-
95
stubs/meson.build | 1 +
96
tests/qemu-iotests/274 | 20 +
97
tests/qemu-iotests/274.out | 68 ++++
98
util/meson.build | 4 +
99
33 files changed, 1420 insertions(+), 122 deletions(-)
100
create mode 100644 block/export/vhost-user-blk-server.h
101
create mode 100644 include/qemu/vhost-user-server.h
102
create mode 100644 util/block-helpers.h
103
create mode 100644 block/export/vhost-user-blk-server.c
104
create mode 100644 stubs/blk-exp-close-all.c
105
create mode 100644 util/block-helpers.c
106
create mode 100644 util/vhost-user-server.c
76
107
77
--
108
--
78
2.21.0
109
2.26.2
79
110
80
diff view generated by jsdifflib
1
When preallocating an encrypted qcow2 image, qcow2 just lets the protocol
1
From: Philippe Mathieu-Daudé <philmd@redhat.com>
2
driver write data and then does not mark the clusters as zero.
3
Therefore, reading this image will yield effectively random data.
4
2
5
As such, we have not fulfilled the promise of always writing zeroes when
3
Keep statistics of some hardware errors, and number of
6
preallocating an image for a while now. It seems that nobody has really
4
aligned/unaligned I/O accesses.
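The accounting described here can be sketched on its own. This is a minimal illustration, not the block/nvme.c code: `Stats`, `is_aligned`, `account_access` and `account_completion` are hypothetical names, and the sketch checks a single buffer, whereas the driver inspects every element of an I/O vector.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical counterpart of the per-device statistics block. */
typedef struct {
    uint64_t completion_errors;
    uint64_t aligned_accesses;
    uint64_t unaligned_accesses;
} Stats;

/*
 * True if both the buffer address and the length are multiples of
 * page_size. page_size must be a non-zero power of two.
 */
bool is_aligned(uintptr_t addr, size_t len, size_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return ((addr | len) & (page_size - 1)) == 0;
}

/* Count one request as aligned or unaligned (assumed bounce-buffered). */
void account_access(Stats *st, uintptr_t addr, size_t len, size_t page_size)
{
    if (is_aligned(addr, len, page_size)) {
        st->aligned_accesses++;
    } else {
        st->unaligned_accesses++;
    }
}

/* Count a completion whose translated status is non-zero as an error. */
void account_completion(Stats *st, int ret)
{
    if (ret) {
        st->completion_errors++;
    }
}
```

A high share of unaligned accesses in such counters would suggest that many requests take the slower buffered path.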
7
cared, so change the documentation to conform to qemu's actual behavior.
8
5
9
Signed-off-by: Max Reitz <mreitz@redhat.com>
6
QMP example booting a full RHEL 8.3 aarch64 guest:
10
Message-id: 20190711132935.13070-1-mreitz@redhat.com
7
11
Reviewed-by: Eric Blake <eblake@redhat.com>
8
{ "execute": "query-blockstats" }
12
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
9
{
13
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
10
"return": [
14
Signed-off-by: Max Reitz <mreitz@redhat.com>
11
{
12
"device": "",
13
"node-name": "drive0",
14
"stats": {
15
"flush_total_time_ns": 6026948,
16
"wr_highest_offset": 3383991230464,
17
"wr_total_time_ns": 807450995,
18
"failed_wr_operations": 0,
19
"failed_rd_operations": 0,
20
"wr_merged": 3,
21
"wr_bytes": 50133504,
22
"failed_unmap_operations": 0,
23
"failed_flush_operations": 0,
24
"account_invalid": false,
25
"rd_total_time_ns": 1846979900,
26
"flush_operations": 130,
27
"wr_operations": 659,
28
"rd_merged": 1192,
29
"rd_bytes": 218244096,
30
"account_failed": false,
31
"idle_time_ns": 2678641497,
32
"rd_operations": 7406
33
},
34
"driver-specific": {
35
"driver": "nvme",
36
"completion-errors": 0,
37
"unaligned-accesses": 2959,
38
"aligned-accesses": 4477
39
},
40
"qdev": "/machine/peripheral-anon/device[0]/virtio-backend"
41
}
42
]
43
}
44
45
Suggested-by: Stefan Hajnoczi <stefanha@gmail.com>
46
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
47
Acked-by: Markus Armbruster <armbru@redhat.com>
48
Message-id: 20201001162939.1567915-1-philmd@redhat.com
49
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
15
---
50
---
16
qapi/block-core.json | 9 +++++----
51
qapi/block-core.json | 24 +++++++++++++++++++++++-
17
docs/qemu-block-drivers.texi | 4 ++--
52
block/nvme.c | 27 +++++++++++++++++++++++++++
18
qemu-img.texi | 4 ++--
53
2 files changed, 50 insertions(+), 1 deletion(-)
19
3 files changed, 9 insertions(+), 8 deletions(-)
20
54
21
diff --git a/qapi/block-core.json b/qapi/block-core.json
55
diff --git a/qapi/block-core.json b/qapi/block-core.json
22
index XXXXXXX..XXXXXXX 100644
56
index XXXXXXX..XXXXXXX 100644
23
--- a/qapi/block-core.json
57
--- a/qapi/block-core.json
24
+++ b/qapi/block-core.json
58
+++ b/qapi/block-core.json
25
@@ -XXX,XX +XXX,XX @@
59
@@ -XXX,XX +XXX,XX @@
26
# @off: no preallocation
60
'discard-nb-failed': 'uint64',
27
# @metadata: preallocate only for metadata
61
'discard-bytes-ok': 'uint64' } }
28
# @falloc: like @full preallocation but allocate disk space by
62
29
-# posix_fallocate() rather than writing zeros.
63
+##
30
-# @full: preallocate all data by writing zeros to device to ensure disk
64
+# @BlockStatsSpecificNvme:
31
-# space is really available. @full preallocation also sets up
65
+#
32
-# metadata correctly.
66
+# NVMe driver statistics
33
+# posix_fallocate() rather than writing data.
67
+#
34
+# @full: preallocate all data by writing it to the device to ensure
68
+# @completion-errors: The number of completion errors.
35
+# disk space is really available. This data may or may not be
69
+#
36
+# zero, depending on the image format and storage.
70
+# @aligned-accesses: The number of aligned accesses performed by
37
+# @full preallocation also sets up metadata correctly.
71
+# the driver.
72
+#
73
+# @unaligned-accesses: The number of unaligned accesses performed by
74
+# the driver.
75
+#
76
+# Since: 5.2
77
+##
78
+{ 'struct': 'BlockStatsSpecificNvme',
79
+ 'data': {
80
+ 'completion-errors': 'uint64',
81
+ 'aligned-accesses': 'uint64',
82
+ 'unaligned-accesses': 'uint64' } }
83
+
84
##
85
# @BlockStatsSpecific:
38
#
86
#
39
# Since: 2.2
87
@@ -XXX,XX +XXX,XX @@
88
'discriminator': 'driver',
89
'data': {
90
'file': 'BlockStatsSpecificFile',
91
- 'host_device': 'BlockStatsSpecificFile' } }
92
+ 'host_device': 'BlockStatsSpecificFile',
93
+ 'nvme': 'BlockStatsSpecificNvme' } }
94
40
##
95
##
41
diff --git a/docs/qemu-block-drivers.texi b/docs/qemu-block-drivers.texi
96
# @BlockStats:
97
diff --git a/block/nvme.c b/block/nvme.c
42
index XXXXXXX..XXXXXXX 100644
98
index XXXXXXX..XXXXXXX 100644
43
--- a/docs/qemu-block-drivers.texi
99
--- a/block/nvme.c
44
+++ b/docs/qemu-block-drivers.texi
100
+++ b/block/nvme.c
45
@@ -XXX,XX +XXX,XX @@ Supported options:
101
@@ -XXX,XX +XXX,XX @@ struct BDRVNVMeState {
46
@item preallocation
102
47
Preallocation mode (allowed values: @code{off}, @code{falloc}, @code{full}).
103
/* PCI address (required for nvme_refresh_filename()) */
48
@code{falloc} mode preallocates space for image by calling posix_fallocate().
104
char *device;
49
-@code{full} mode preallocates space for image by writing zeros to underlying
105
+
50
-storage.
106
+ struct {
51
+@code{full} mode preallocates space for image by writing data to underlying
107
+ uint64_t completion_errors;
52
+storage. This data may or may not be zero, depending on the storage location.
108
+ uint64_t aligned_accesses;
53
@end table
109
+ uint64_t unaligned_accesses;
54
110
+ } stats;
55
@item qcow2
111
};
56
diff --git a/qemu-img.texi b/qemu-img.texi
112
57
index XXXXXXX..XXXXXXX 100644
113
#define NVME_BLOCK_OPT_DEVICE "device"
58
--- a/qemu-img.texi
114
@@ -XXX,XX +XXX,XX @@ static bool nvme_process_completion(NVMeQueuePair *q)
59
+++ b/qemu-img.texi
115
break;
60
@@ -XXX,XX +XXX,XX @@ Supported options:
116
}
61
@item preallocation
117
ret = nvme_translate_error(c);
62
Preallocation mode (allowed values: @code{off}, @code{falloc}, @code{full}).
118
+ if (ret) {
63
@code{falloc} mode preallocates space for image by calling posix_fallocate().
119
+ s->stats.completion_errors++;
64
-@code{full} mode preallocates space for image by writing zeros to underlying
120
+ }
65
-storage.
121
q->cq.head = (q->cq.head + 1) % NVME_QUEUE_SIZE;
66
+@code{full} mode preallocates space for image by writing data to underlying
122
if (!q->cq.head) {
67
+storage. This data may or may not be zero, depending on the storage location.
123
q->cq_phase = !q->cq_phase;
68
@end table
124
@@ -XXX,XX +XXX,XX @@ static int nvme_co_prw(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
69
125
assert(QEMU_IS_ALIGNED(bytes, s->page_size));
70
@item qcow2
126
assert(bytes <= s->max_transfer);
127
if (nvme_qiov_aligned(bs, qiov)) {
128
+ s->stats.aligned_accesses++;
129
return nvme_co_prw_aligned(bs, offset, bytes, qiov, is_write, flags);
130
}
131
+ s->stats.unaligned_accesses++;
132
trace_nvme_prw_buffered(s, offset, bytes, qiov->niov, is_write);
133
buf = qemu_try_memalign(s->page_size, bytes);
134
135
@@ -XXX,XX +XXX,XX @@ static void nvme_unregister_buf(BlockDriverState *bs, void *host)
136
qemu_vfio_dma_unmap(s->vfio, host);
137
}
138
139
+static BlockStatsSpecific *nvme_get_specific_stats(BlockDriverState *bs)
140
+{
141
+ BlockStatsSpecific *stats = g_new(BlockStatsSpecific, 1);
142
+ BDRVNVMeState *s = bs->opaque;
143
+
144
+ stats->driver = BLOCKDEV_DRIVER_NVME;
145
+ stats->u.nvme = (BlockStatsSpecificNvme) {
146
+ .completion_errors = s->stats.completion_errors,
147
+ .aligned_accesses = s->stats.aligned_accesses,
148
+ .unaligned_accesses = s->stats.unaligned_accesses,
149
+ };
150
+
151
+ return stats;
152
+}
153
+
154
static const char *const nvme_strong_runtime_opts[] = {
155
NVME_BLOCK_OPT_DEVICE,
156
NVME_BLOCK_OPT_NAMESPACE,
157
@@ -XXX,XX +XXX,XX @@ static BlockDriver bdrv_nvme = {
158
.bdrv_refresh_filename = nvme_refresh_filename,
159
.bdrv_refresh_limits = nvme_refresh_limits,
160
.strong_runtime_opts = nvme_strong_runtime_opts,
161
+ .bdrv_get_specific_stats = nvme_get_specific_stats,
162
163
.bdrv_detach_aio_context = nvme_detach_aio_context,
164
.bdrv_attach_aio_context = nvme_attach_aio_context,
71
--
165
--
72
2.21.0
166
2.26.2
73
167
74
1
bdrv_has_zero_init() only has meaning for newly created images or image
1
From: Coiby Xu <coiby.xu@gmail.com>
2
areas. If the mirror job itself did not create the image, it cannot
3
rely on bdrv_has_zero_init()'s result to carry any meaning.
4
2
5
This is the case for drive-mirror with mode=existing and always for
3
Allow vu_message_read to be replaced by one which will make use of the
6
blockdev-mirror.
4
QIOChannel functions. Thus reading a vhost-user message won't stall the
5
guest. For the slave channel, we still use the default vu_message_read.
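The replaceable-reader pattern can be sketched in isolation. This is a minimal illustration under stated assumptions, not the libvhost-user code: `Dev`, `Msg`, `dev_init`, `dev_dispatch`, `read_msg_default` and `read_msg_custom` are hypothetical stand-ins for the library's device, message, init, dispatch and reader functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct Dev Dev;
typedef struct Msg { int request; } Msg;   /* hypothetical message type */

/* Signature of a pluggable message reader. */
typedef bool (*read_msg_cb)(Dev *dev, int sock, Msg *msg);

struct Dev {
    int sock;
    read_msg_cb read_msg;
};

/* Stand-in for the built-in reader; marks the message with 1. */
bool read_msg_default(Dev *dev, int sock, Msg *msg)
{
    (void)dev;
    (void)sock;
    msg->request = 1;
    return true;
}

/* Stand-in for a caller-supplied reader; marks the message with 2. */
bool read_msg_custom(Dev *dev, int sock, Msg *msg)
{
    (void)dev;
    (void)sock;
    msg->request = 2;
    return true;
}

/* Passing NULL selects the default reader. */
void dev_init(Dev *dev, int sock, read_msg_cb read_msg)
{
    dev->sock = sock;
    dev->read_msg = read_msg ? read_msg : read_msg_default;
}

/* Reads on the main socket always go through the function pointer. */
bool dev_dispatch(Dev *dev, Msg *msg)
{
    return dev->read_msg(dev, dev->sock, msg);
}
```

Existing callers that pass NULL keep the old behaviour, which is why the other users of the init function in this patch only gain one extra argument.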
7
6
8
Note that we only have to zero-initialize the target with sync=full,
7
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
9
because other modes actually do not promise that the target will contain
8
Signed-off-by: Coiby Xu <coiby.xu@gmail.com>
10
the same data as the source after the job -- sync=top only promises to
9
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
11
copy anything allocated in the top layer, and sync=none will only copy
10
Message-id: 20200918080912.321299-2-coiby.xu@gmail.com
12
new I/O. (Which is how mirror has always handled it.)
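The rule in this commit message can be written as two small predicates. This is a minimal sketch: `SyncMode`, `ImageMode` and both function names are hypothetical stand-ins for the QEMU enums and call sites, and `target_has_zero_init` models the result of bdrv_has_zero_init() on the target.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the sync mode and new-image mode enums. */
typedef enum { SYNC_TOP, SYNC_FULL, SYNC_NONE } SyncMode;
typedef enum { IMAGE_EXISTING, IMAGE_ABSOLUTE_PATHS } ImageMode;

/*
 * drive-mirror: zero the target only for sync=full, and only if the job
 * did not create the image itself (mode=existing) or the freshly created
 * image is not guaranteed to read as zeroes.
 */
bool drive_mirror_zero_target(SyncMode sync, ImageMode mode,
                              bool target_has_zero_init)
{
    return sync == SYNC_FULL &&
           (mode == IMAGE_EXISTING || !target_has_zero_init);
}

/*
 * blockdev-mirror: the target always exists beforehand, so the zero-init
 * property carries no meaning and only the sync mode matters.
 */
bool blockdev_mirror_zero_target(SyncMode sync)
{
    return sync == SYNC_FULL;
}
```

For example, a sync=full drive-mirror onto an existing image requests zeroing even when the target driver reports zero-init support.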
11
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
12
---
13
contrib/libvhost-user/libvhost-user.h | 21 +++++++++++++++++++++
14
contrib/libvhost-user/libvhost-user-glib.c | 2 +-
15
contrib/libvhost-user/libvhost-user.c | 14 +++++++-------
16
tests/vhost-user-bridge.c | 2 ++
17
tools/virtiofsd/fuse_virtio.c | 4 ++--
18
5 files changed, 33 insertions(+), 10 deletions(-)
13
19
14
Signed-off-by: Max Reitz <mreitz@redhat.com>
20
diff --git a/contrib/libvhost-user/libvhost-user.h b/contrib/libvhost-user/libvhost-user.h
15
Message-id: 20190724171239.8764-3-mreitz@redhat.com
16
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
17
Signed-off-by: Max Reitz <mreitz@redhat.com>
18
---
19
include/block/block_int.h | 2 ++
20
block/mirror.c | 11 ++++++++---
21
blockdev.c | 16 +++++++++++++---
22
tests/test-block-iothread.c | 2 +-
23
4 files changed, 24 insertions(+), 7 deletions(-)
24
25
diff --git a/include/block/block_int.h b/include/block/block_int.h
26
index XXXXXXX..XXXXXXX 100644
21
index XXXXXXX..XXXXXXX 100644
27
--- a/include/block/block_int.h
22
--- a/contrib/libvhost-user/libvhost-user.h
28
+++ b/include/block/block_int.h
23
+++ b/contrib/libvhost-user/libvhost-user.h
29
@@ -XXX,XX +XXX,XX @@ BlockJob *commit_active_start(const char *job_id, BlockDriverState *bs,
24
@@ -XXX,XX +XXX,XX @@
30
* @buf_size: The amount of data that can be in flight at one time.
25
*/
31
* @mode: Whether to collapse all images in the chain to the target.
26
#define VHOST_USER_MAX_RAM_SLOTS 32
32
* @backing_mode: How to establish the target's backing chain after completion.
27
33
+ * @zero_target: Whether the target should be explicitly zero-initialized
28
+#define VHOST_USER_HDR_SIZE offsetof(VhostUserMsg, payload.u64)
34
* @on_source_error: The action to take upon error reading from the source.
29
+
35
* @on_target_error: The action to take upon error writing to the target.
30
typedef enum VhostSetConfigType {
36
* @unmap: Whether to unmap target where source sectors only contain zeroes.
31
VHOST_SET_CONFIG_TYPE_MASTER = 0,
37
@@ -XXX,XX +XXX,XX @@ void mirror_start(const char *job_id, BlockDriverState *bs,
32
VHOST_SET_CONFIG_TYPE_MIGRATION = 1,
38
int creation_flags, int64_t speed,
33
@@ -XXX,XX +XXX,XX @@ typedef uint64_t (*vu_get_features_cb) (VuDev *dev);
39
uint32_t granularity, int64_t buf_size,
34
typedef void (*vu_set_features_cb) (VuDev *dev, uint64_t features);
40
MirrorSyncMode mode, BlockMirrorBackingMode backing_mode,
35
typedef int (*vu_process_msg_cb) (VuDev *dev, VhostUserMsg *vmsg,
41
+ bool zero_target,
36
int *do_reply);
42
BlockdevOnError on_source_error,
37
+typedef bool (*vu_read_msg_cb) (VuDev *dev, int sock, VhostUserMsg *vmsg);
43
BlockdevOnError on_target_error,
38
typedef void (*vu_queue_set_started_cb) (VuDev *dev, int qidx, bool started);
44
bool unmap, const char *filter_node_name,
39
typedef bool (*vu_queue_is_processed_in_order_cb) (VuDev *dev, int qidx);
45
diff --git a/block/mirror.c b/block/mirror.c
40
typedef int (*vu_get_config_cb) (VuDev *dev, uint8_t *config, uint32_t len);
41
@@ -XXX,XX +XXX,XX @@ struct VuDev {
42
bool broken;
43
uint16_t max_queues;
44
45
+ /* @read_msg: custom method to read vhost-user message
46
+ *
47
+ * Read data from vhost_user socket fd and fill up
48
+ * the passed VhostUserMsg *vmsg struct.
49
+ *
50
+ * If reading fails, it should close the received set of file
51
+ * descriptors as socket message's auxiliary data.
52
+ *
53
+ * For the details, please refer to vu_message_read in libvhost-user.c
54
+ * which will be used by default if no custom method is provided when
55
+ * calling vu_init
56
+ *
57
+ * Returns: true if vhost-user message successfully received,
58
+ * otherwise return false.
59
+ *
60
+ */
61
+ vu_read_msg_cb read_msg;
62
/* @set_watch: add or update the given fd to the watch set,
63
* call cb when condition is met */
64
vu_set_watch_cb set_watch;
65
@@ -XXX,XX +XXX,XX @@ bool vu_init(VuDev *dev,
66
uint16_t max_queues,
67
int socket,
68
vu_panic_cb panic,
69
+ vu_read_msg_cb read_msg,
70
vu_set_watch_cb set_watch,
71
vu_remove_watch_cb remove_watch,
72
const VuDevIface *iface);
73
diff --git a/contrib/libvhost-user/libvhost-user-glib.c b/contrib/libvhost-user/libvhost-user-glib.c
46
index XXXXXXX..XXXXXXX 100644
74
index XXXXXXX..XXXXXXX 100644
47
--- a/block/mirror.c
75
--- a/contrib/libvhost-user/libvhost-user-glib.c
48
+++ b/block/mirror.c
76
+++ b/contrib/libvhost-user/libvhost-user-glib.c
49
@@ -XXX,XX +XXX,XX @@ typedef struct MirrorBlockJob {
77
@@ -XXX,XX +XXX,XX @@ vug_init(VugDev *dev, uint16_t max_queues, int socket,
50
Error *replace_blocker;
78
g_assert(dev);
51
bool is_none_mode;
79
g_assert(iface);
52
BlockMirrorBackingMode backing_mode;
80
53
+ /* Whether the target image requires explicit zero-initialization */
81
- if (!vu_init(&dev->parent, max_queues, socket, panic, set_watch,
54
+ bool zero_target;
82
+ if (!vu_init(&dev->parent, max_queues, socket, panic, NULL, set_watch,
55
MirrorCopyMode copy_mode;
83
remove_watch, iface)) {
56
BlockdevOnError on_source_error, on_target_error;
84
return false;
57
bool synced;
85
}
58
@@ -XXX,XX +XXX,XX @@ static int coroutine_fn mirror_dirty_init(MirrorBlockJob *s)
86
diff --git a/contrib/libvhost-user/libvhost-user.c b/contrib/libvhost-user/libvhost-user.c
59
int ret;
60
int64_t count;
61
62
- if (base == NULL && !bdrv_has_zero_init(target_bs)) {
63
+ if (s->zero_target) {
64
if (!bdrv_can_write_zeroes_with_unmap(target_bs)) {
65
bdrv_set_dirty_bitmap(s->dirty_bitmap, 0, s->bdev_length);
66
return 0;
67
@@ -XXX,XX +XXX,XX @@ static BlockJob *mirror_start_job(
68
const char *replaces, int64_t speed,
69
uint32_t granularity, int64_t buf_size,
70
BlockMirrorBackingMode backing_mode,
71
+ bool zero_target,
72
BlockdevOnError on_source_error,
73
BlockdevOnError on_target_error,
74
bool unmap,
75
@@ -XXX,XX +XXX,XX @@ static BlockJob *mirror_start_job(
76
s->on_target_error = on_target_error;
77
s->is_none_mode = is_none_mode;
78
s->backing_mode = backing_mode;
79
+ s->zero_target = zero_target;
80
s->copy_mode = copy_mode;
81
s->base = base;
82
s->granularity = granularity;
83
@@ -XXX,XX +XXX,XX @@ void mirror_start(const char *job_id, BlockDriverState *bs,
84
int creation_flags, int64_t speed,
85
uint32_t granularity, int64_t buf_size,
86
MirrorSyncMode mode, BlockMirrorBackingMode backing_mode,
87
+ bool zero_target,
88
BlockdevOnError on_source_error,
89
BlockdevOnError on_target_error,
90
bool unmap, const char *filter_node_name,
91
@@ -XXX,XX +XXX,XX @@ void mirror_start(const char *job_id, BlockDriverState *bs,
92
is_none_mode = mode == MIRROR_SYNC_MODE_NONE;
93
base = mode == MIRROR_SYNC_MODE_TOP ? backing_bs(bs) : NULL;
94
mirror_start_job(job_id, bs, creation_flags, target, replaces,
95
- speed, granularity, buf_size, backing_mode,
96
+ speed, granularity, buf_size, backing_mode, zero_target,
97
on_source_error, on_target_error, unmap, NULL, NULL,
98
&mirror_job_driver, is_none_mode, base, false,
99
filter_node_name, true, copy_mode, errp);
100
@@ -XXX,XX +XXX,XX @@ BlockJob *commit_active_start(const char *job_id, BlockDriverState *bs,
101
102
ret = mirror_start_job(
103
job_id, bs, creation_flags, base, NULL, speed, 0, 0,
104
- MIRROR_LEAVE_BACKING_CHAIN,
105
+ MIRROR_LEAVE_BACKING_CHAIN, false,
106
on_error, on_error, true, cb, opaque,
107
&commit_active_job_driver, false, base, auto_complete,
108
filter_node_name, false, MIRROR_COPY_MODE_BACKGROUND,
109
diff --git a/blockdev.c b/blockdev.c
110
index XXXXXXX..XXXXXXX 100644
87
index XXXXXXX..XXXXXXX 100644
111
--- a/blockdev.c
88
--- a/contrib/libvhost-user/libvhost-user.c
112
+++ b/blockdev.c
89
+++ b/contrib/libvhost-user/libvhost-user.c
113
@@ -XXX,XX +XXX,XX @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
90
@@ -XXX,XX +XXX,XX @@
114
bool has_replaces, const char *replaces,
91
/* The version of inflight buffer */
115
enum MirrorSyncMode sync,
92
#define INFLIGHT_VERSION 1
116
BlockMirrorBackingMode backing_mode,
93
117
+ bool zero_target,
94
-#define VHOST_USER_HDR_SIZE offsetof(VhostUserMsg, payload.u64)
118
bool has_speed, int64_t speed,
95
-
119
bool has_granularity, uint32_t granularity,
96
/* The version of the protocol we support */
120
bool has_buf_size, int64_t buf_size,
97
#define VHOST_USER_VERSION 1
121
@@ -XXX,XX +XXX,XX @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
98
#define LIBVHOST_USER_DEBUG 0
122
*/
99
@@ -XXX,XX +XXX,XX @@ have_userfault(void)
123
mirror_start(job_id, bs, target,
124
has_replaces ? replaces : NULL, job_flags,
125
- speed, granularity, buf_size, sync, backing_mode,
126
+ speed, granularity, buf_size, sync, backing_mode, zero_target,
127
on_source_error, on_target_error, unmap, filter_node_name,
128
copy_mode, errp);
129
}
100
}
130
@@ -XXX,XX +XXX,XX @@ void qmp_drive_mirror(DriveMirror *arg, Error **errp)
101
131
int flags;
102
static bool
132
int64_t size;
103
-vu_message_read(VuDev *dev, int conn_fd, VhostUserMsg *vmsg)
133
const char *format = arg->format;
104
+vu_message_read_default(VuDev *dev, int conn_fd, VhostUserMsg *vmsg)
134
+ bool zero_target;
105
{
135
int ret;
106
char control[CMSG_SPACE(VHOST_MEMORY_BASELINE_NREGIONS * sizeof(int))] = {};
136
107
struct iovec iov = {
137
bs = qmp_get_root_bs(arg->device, errp);
108
@@ -XXX,XX +XXX,XX @@ vu_process_message_reply(VuDev *dev, const VhostUserMsg *vmsg)
138
@@ -XXX,XX +XXX,XX @@ void qmp_drive_mirror(DriveMirror *arg, Error **errp)
139
goto out;
109
goto out;
140
}
110
}
141
111
142
+ zero_target = (arg->sync == MIRROR_SYNC_MODE_FULL &&
112
- if (!vu_message_read(dev, dev->slave_fd, &msg_reply)) {
143
+ (arg->mode == NEW_IMAGE_MODE_EXISTING ||
113
+ if (!vu_message_read_default(dev, dev->slave_fd, &msg_reply)) {
144
+ !bdrv_has_zero_init(target_bs)));
114
goto out;
145
+
115
}
146
ret = bdrv_try_set_aio_context(target_bs, aio_context, errp);
116
147
if (ret < 0) {
117
@@ -XXX,XX +XXX,XX @@ vu_set_mem_table_exec_postcopy(VuDev *dev, VhostUserMsg *vmsg)
148
bdrv_unref(target_bs);
118
/* Wait for QEMU to confirm that it's registered the handler for the
149
@@ -XXX,XX +XXX,XX @@ void qmp_drive_mirror(DriveMirror *arg, Error **errp)
119
* faults.
150
120
*/
151
blockdev_mirror_common(arg->has_job_id ? arg->job_id : NULL, bs, target_bs,
121
- if (!vu_message_read(dev, dev->sock, vmsg) ||
152
arg->has_replaces, arg->replaces, arg->sync,
122
+ if (!dev->read_msg(dev, dev->sock, vmsg) ||
153
- backing_mode, arg->has_speed, arg->speed,
123
vmsg->size != sizeof(vmsg->payload.u64) ||
154
+ backing_mode, zero_target,
124
vmsg->payload.u64 != 0) {
155
+ arg->has_speed, arg->speed,
125
vu_panic(dev, "failed to receive valid ack for postcopy set-mem-table");
156
arg->has_granularity, arg->granularity,
126
@@ -XXX,XX +XXX,XX @@ vu_dispatch(VuDev *dev)
157
arg->has_buf_size, arg->buf_size,
127
int reply_requested;
158
arg->has_on_source_error, arg->on_source_error,
128
bool need_reply, success = false;
159
@@ -XXX,XX +XXX,XX @@ void qmp_blockdev_mirror(bool has_job_id, const char *job_id,
129
160
AioContext *aio_context;
130
- if (!vu_message_read(dev, dev->sock, &vmsg)) {
161
BlockMirrorBackingMode backing_mode = MIRROR_LEAVE_BACKING_CHAIN;
131
+ if (!dev->read_msg(dev, dev->sock, &vmsg)) {
162
Error *local_err = NULL;
132
goto end;
163
+ bool zero_target;
133
}
164
int ret;
134
165
135
@@ -XXX,XX +XXX,XX @@ vu_init(VuDev *dev,
166
bs = qmp_get_root_bs(device, errp);
136
uint16_t max_queues,
167
@@ -XXX,XX +XXX,XX @@ void qmp_blockdev_mirror(bool has_job_id, const char *job_id,
137
int socket,
138
vu_panic_cb panic,
139
+ vu_read_msg_cb read_msg,
140
vu_set_watch_cb set_watch,
141
vu_remove_watch_cb remove_watch,
142
const VuDevIface *iface)
143
@@ -XXX,XX +XXX,XX @@ vu_init(VuDev *dev,
144
145
dev->sock = socket;
146
dev->panic = panic;
147
+ dev->read_msg = read_msg ? read_msg : vu_message_read_default;
148
dev->set_watch = set_watch;
149
dev->remove_watch = remove_watch;
150
dev->iface = iface;
151
@@ -XXX,XX +XXX,XX @@ static void _vu_queue_notify(VuDev *dev, VuVirtq *vq, bool sync)
152
153
vu_message_write(dev, dev->slave_fd, &vmsg);
154
if (ack) {
155
- vu_message_read(dev, dev->slave_fd, &vmsg);
156
+ vu_message_read_default(dev, dev->slave_fd, &vmsg);
157
}
168
return;
158
return;
169
}
159
}
170
160
diff --git a/tests/vhost-user-bridge.c b/tests/vhost-user-bridge.c
171
+ zero_target = (sync == MIRROR_SYNC_MODE_FULL);
172
+
173
aio_context = bdrv_get_aio_context(bs);
174
aio_context_acquire(aio_context);
175
176
@@ -XXX,XX +XXX,XX @@ void qmp_blockdev_mirror(bool has_job_id, const char *job_id,
177
178
blockdev_mirror_common(has_job_id ? job_id : NULL, bs, target_bs,
179
has_replaces, replaces, sync, backing_mode,
180
- has_speed, speed,
181
+ zero_target, has_speed, speed,
182
has_granularity, granularity,
183
has_buf_size, buf_size,
184
has_on_source_error, on_source_error,
185
diff --git a/tests/test-block-iothread.c b/tests/test-block-iothread.c
186
index XXXXXXX..XXXXXXX 100644
161
index XXXXXXX..XXXXXXX 100644
187
--- a/tests/test-block-iothread.c
162
--- a/tests/vhost-user-bridge.c
188
+++ b/tests/test-block-iothread.c
163
+++ b/tests/vhost-user-bridge.c
189
@@ -XXX,XX +XXX,XX @@ static void test_propagate_mirror(void)
164
@@ -XXX,XX +XXX,XX @@ vubr_accept_cb(int sock, void *ctx)
190
165
VHOST_USER_BRIDGE_MAX_QUEUES,
191
/* Start a mirror job */
166
conn_fd,
192
mirror_start("job0", src, target, NULL, JOB_DEFAULT, 0, 0, 0,
167
vubr_panic,
193
- MIRROR_SYNC_MODE_NONE, MIRROR_OPEN_BACKING_CHAIN,
168
+ NULL,
194
+ MIRROR_SYNC_MODE_NONE, MIRROR_OPEN_BACKING_CHAIN, false,
169
vubr_set_watch,
195
BLOCKDEV_ON_ERROR_REPORT, BLOCKDEV_ON_ERROR_REPORT,
170
vubr_remove_watch,
196
false, "filter_node", MIRROR_COPY_MODE_BACKGROUND,
171
&vuiface)) {
197
&error_abort);
172
@@ -XXX,XX +XXX,XX @@ vubr_new(const char *path, bool client)
173
VHOST_USER_BRIDGE_MAX_QUEUES,
174
dev->sock,
175
vubr_panic,
176
+ NULL,
177
vubr_set_watch,
178
vubr_remove_watch,
179
&vuiface)) {
180
diff --git a/tools/virtiofsd/fuse_virtio.c b/tools/virtiofsd/fuse_virtio.c
181
index XXXXXXX..XXXXXXX 100644
182
--- a/tools/virtiofsd/fuse_virtio.c
183
+++ b/tools/virtiofsd/fuse_virtio.c
184
@@ -XXX,XX +XXX,XX @@ int virtio_session_mount(struct fuse_session *se)
185
se->vu_socketfd = data_sock;
186
se->virtio_dev->se = se;
187
pthread_rwlock_init(&se->virtio_dev->vu_dispatch_rwlock, NULL);
188
- vu_init(&se->virtio_dev->dev, 2, se->vu_socketfd, fv_panic, fv_set_watch,
189
- fv_remove_watch, &fv_iface);
190
+ vu_init(&se->virtio_dev->dev, 2, se->vu_socketfd, fv_panic, NULL,
191
+ fv_set_watch, fv_remove_watch, &fv_iface);
192
193
return 0;
194
}
198
--
195
--
199
2.21.0
196
2.26.2
200
197
201
New patch
1
From: Coiby Xu <coiby.xu@gmail.com>
1
2
3
When the client is running in gdb and the quit command is run in gdb,
4
QEMU will still dispatch the event, which will cause a segmentation fault in
5
the callback function.
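The teardown order behind this fix, unregistering a descriptor before closing it, can be sketched as follows. This is a minimal illustration: `WatchSet`, `Queue` and the `watch_*` and `queue_deinit` functions are hypothetical, and a real event loop would keep callbacks alongside the descriptors.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <unistd.h>

#define MAX_WATCHES 8

/* Hypothetical, minimal stand-in for an event loop's watched fds. */
typedef struct {
    int fds[MAX_WATCHES];
    size_t count;
} WatchSet;

bool watch_contains(const WatchSet *ws, int fd)
{
    for (size_t i = 0; i < ws->count; i++) {
        if (ws->fds[i] == fd) {
            return true;
        }
    }
    return false;
}

bool watch_add(WatchSet *ws, int fd)
{
    if (ws->count == MAX_WATCHES || watch_contains(ws, fd)) {
        return false;
    }
    ws->fds[ws->count++] = fd;
    return true;
}

void watch_remove(WatchSet *ws, int fd)
{
    for (size_t i = 0; i < ws->count; i++) {
        if (ws->fds[i] == fd) {
            /* Fill the hole with the last entry. */
            ws->fds[i] = ws->fds[--ws->count];
            return;
        }
    }
}

typedef struct { int kick_fd; } Queue;

/* Unregister first, then close, so no event fires on a dead fd. */
void queue_deinit(WatchSet *ws, Queue *q)
{
    if (q->kick_fd != -1) {
        watch_remove(ws, q->kick_fd);
        assert(!watch_contains(ws, q->kick_fd));
        close(q->kick_fd);
        q->kick_fd = -1;
    }
}
```

Closing without unregistering would leave a stale entry that the loop could still dispatch, which is the failure mode the commit message describes.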
6
7
Signed-off-by: Coiby Xu <coiby.xu@gmail.com>
8
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
9
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
10
Message-id: 20200918080912.321299-3-coiby.xu@gmail.com
11
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
12
---
13
contrib/libvhost-user/libvhost-user.c | 1 +
14
1 file changed, 1 insertion(+)
15
16
diff --git a/contrib/libvhost-user/libvhost-user.c b/contrib/libvhost-user/libvhost-user.c
17
index XXXXXXX..XXXXXXX 100644
18
--- a/contrib/libvhost-user/libvhost-user.c
19
+++ b/contrib/libvhost-user/libvhost-user.c
20
@@ -XXX,XX +XXX,XX @@ vu_deinit(VuDev *dev)
21
}
22
23
if (vq->kick_fd != -1) {
24
+ dev->remove_watch(dev, vq->kick_fd);
25
close(vq->kick_fd);
26
vq->kick_fd = -1;
27
}
28
--
29
2.26.2
30
1
From: Coiby Xu <coiby.xu@gmail.com>
1
2
3
Share QEMU devices via the vhost-user protocol.

Only one vhost-user client can connect to the server at a time.
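
The one-client-at-a-time rule is enforced in this patch by unsetting the listener's accept callback while a client is connected and restoring it when the client goes away. A minimal sketch of that idea follows. The `struct server` type and the `listener_incoming`, `server_accept` and `server_client_gone` functions are hypothetical stand-ins, not the QIONetListener API, and the `deferred` counter only models connections that would otherwise sit in the listen backlog.

```c
#include <assert.h>
#include <stddef.h>

struct server;
typedef void (*accept_fn)(struct server *s, int client_fd);

struct server {
    accept_fn on_accept;  /* NULL while a client is being served */
    int client_fd;        /* -1 when no client is connected */
    int deferred;         /* connections that arrived while busy */
};

void server_accept(struct server *s, int client_fd);

void server_init(struct server *s)
{
    s->on_accept = server_accept;
    s->client_fd = -1;
    s->deferred = 0;
}

/* Called by the (hypothetical) listener for each incoming connection. */
void listener_incoming(struct server *s, int client_fd)
{
    if (s->on_accept) {
        s->on_accept(s, client_fd);
    } else {
        s->deferred++;  /* no callback set: the connection has to wait */
    }
}

/* Adopt the client and stop accepting until it disconnects. */
void server_accept(struct server *s, int client_fd)
{
    assert(s->client_fd == -1);
    s->client_fd = client_fd;
    s->on_accept = NULL;
}

/* Client disconnected: allow the next one to connect. */
void server_client_gone(struct server *s)
{
    s->client_fd = -1;
    s->on_accept = server_accept;
}
```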
6
7
Suggested-by: Kevin Wolf <kwolf@redhat.com>
8
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
9
Signed-off-by: Coiby Xu <coiby.xu@gmail.com>
10
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
11
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
12
Message-id: 20200918080912.321299-4-coiby.xu@gmail.com
13
[Fixed size_t %lu -> %zu format string compiler error.
14
--Stefan]
15
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
16
---
17
util/vhost-user-server.h | 65 ++++++
18
util/vhost-user-server.c | 428 +++++++++++++++++++++++++++++++++++++++
19
util/meson.build | 1 +
20
3 files changed, 494 insertions(+)
21
create mode 100644 util/vhost-user-server.h
22
create mode 100644 util/vhost-user-server.c
23
24
diff --git a/util/vhost-user-server.h b/util/vhost-user-server.h
25
new file mode 100644
26
index XXXXXXX..XXXXXXX
27
--- /dev/null
28
+++ b/util/vhost-user-server.h
29
@@ -XXX,XX +XXX,XX @@
30
+/*
31
+ * Sharing QEMU devices via vhost-user protocol
32
+ *
33
+ * Copyright (c) Coiby Xu <coiby.xu@gmail.com>.
34
+ * Copyright (c) 2020 Red Hat, Inc.
35
+ *
36
+ * This work is licensed under the terms of the GNU GPL, version 2 or
37
+ * later. See the COPYING file in the top-level directory.
38
+ */
39
+
40
+#ifndef VHOST_USER_SERVER_H
41
+#define VHOST_USER_SERVER_H
42
+
43
+#include "contrib/libvhost-user/libvhost-user.h"
44
+#include "io/channel-socket.h"
45
+#include "io/channel-file.h"
46
+#include "io/net-listener.h"
47
+#include "qemu/error-report.h"
48
+#include "qapi/error.h"
49
+#include "standard-headers/linux/virtio_blk.h"
50
+
51
+typedef struct VuFdWatch {
52
+ VuDev *vu_dev;
53
+ int fd; /*kick fd*/
54
+ void *pvt;
55
+ vu_watch_cb cb;
56
+ bool processing;
57
+ QTAILQ_ENTRY(VuFdWatch) next;
58
+} VuFdWatch;
59
+
60
+typedef struct VuServer VuServer;
61
+typedef void DevicePanicNotifierFn(VuServer *server);
62
+
63
+struct VuServer {
64
+ QIONetListener *listener;
65
+ AioContext *ctx;
66
+ DevicePanicNotifierFn *device_panic_notifier;
67
+ int max_queues;
68
+ const VuDevIface *vu_iface;
69
+ VuDev vu_dev;
70
+ QIOChannel *ioc; /* The I/O channel with the client */
71
+ QIOChannelSocket *sioc; /* The underlying data channel with the client */
72
+ /* IOChannel for fd provided via VHOST_USER_SET_SLAVE_REQ_FD */
73
+ QIOChannel *ioc_slave;
74
+ QIOChannelSocket *sioc_slave;
75
+ Coroutine *co_trip; /* coroutine for processing VhostUserMsg */
76
+ QTAILQ_HEAD(, VuFdWatch) vu_fd_watches;
77
+ /* restart coroutine co_trip if AIOContext is changed */
78
+ bool aio_context_changed;
79
+ bool processing_msg;
80
+};
81
+
82
+bool vhost_user_server_start(VuServer *server,
83
+ SocketAddress *unix_socket,
84
+ AioContext *ctx,
85
+ uint16_t max_queues,
86
+ DevicePanicNotifierFn *device_panic_notifier,
87
+ const VuDevIface *vu_iface,
88
+ Error **errp);
89
+
90
+void vhost_user_server_stop(VuServer *server);
91
+
92
+void vhost_user_server_set_aio_context(VuServer *server, AioContext *ctx);
93
+
94
+#endif /* VHOST_USER_SERVER_H */
95
diff --git a/util/vhost-user-server.c b/util/vhost-user-server.c
96
new file mode 100644
97
index XXXXXXX..XXXXXXX
98
--- /dev/null
99
+++ b/util/vhost-user-server.c
100
@@ -XXX,XX +XXX,XX @@
101
+/*
102
+ * Sharing QEMU devices via vhost-user protocol
103
+ *
104
+ * Copyright (c) Coiby Xu <coiby.xu@gmail.com>.
105
+ * Copyright (c) 2020 Red Hat, Inc.
106
+ *
107
+ * This work is licensed under the terms of the GNU GPL, version 2 or
108
+ * later. See the COPYING file in the top-level directory.
109
+ */
110
+#include "qemu/osdep.h"
111
+#include "qemu/main-loop.h"
112
+#include "vhost-user-server.h"
113
+
114
+static void vmsg_close_fds(VhostUserMsg *vmsg)
115
+{
116
+ int i;
117
+ for (i = 0; i < vmsg->fd_num; i++) {
118
+ close(vmsg->fds[i]);
119
+ }
120
+}
121
+
122
+static void vmsg_unblock_fds(VhostUserMsg *vmsg)
123
+{
124
+ int i;
125
+ for (i = 0; i < vmsg->fd_num; i++) {
126
+ qemu_set_nonblock(vmsg->fds[i]);
127
+ }
128
+}
129
+
130
+static void vu_accept(QIONetListener *listener, QIOChannelSocket *sioc,
131
+ gpointer opaque);
132
+
133
+static void close_client(VuServer *server)
134
+{
135
+ /*
136
+ * Before closing the client
137
+ *
138
+ * 1. Let vu_client_trip stop processing new vhost-user msg
139
+ *
140
+ * 2. remove kick_handler
141
+ *
142
+ * 3. wait for the kick handler to be finished
143
+ *
144
+ * 4. wait for the current vhost-user msg to be finished processing
145
+ */
146
+
147
+ QIOChannelSocket *sioc = server->sioc;
148
+ /* When this is set, vu_client_trip will stop processing new vhost-user messages */
149
+ server->sioc = NULL;
150
+
151
+ VuFdWatch *vu_fd_watch, *next;
152
+ QTAILQ_FOREACH_SAFE(vu_fd_watch, &server->vu_fd_watches, next, next) {
153
+ aio_set_fd_handler(server->ioc->ctx, vu_fd_watch->fd, true, NULL,
154
+ NULL, NULL, NULL);
155
+ }
156
+
157
+ while (!QTAILQ_EMPTY(&server->vu_fd_watches)) {
158
+ QTAILQ_FOREACH_SAFE(vu_fd_watch, &server->vu_fd_watches, next, next) {
159
+ if (!vu_fd_watch->processing) {
160
+ QTAILQ_REMOVE(&server->vu_fd_watches, vu_fd_watch, next);
161
+ g_free(vu_fd_watch);
162
+ }
163
+ }
164
+ }
165
+
166
+ while (server->processing_msg) {
167
+ if (server->ioc->read_coroutine) {
168
+ server->ioc->read_coroutine = NULL;
169
+ qio_channel_set_aio_fd_handler(server->ioc, server->ioc->ctx, NULL,
170
+ NULL, server->ioc);
171
+ server->processing_msg = false;
172
+ }
173
+ }
174
+
175
+ vu_deinit(&server->vu_dev);
176
+ object_unref(OBJECT(sioc));
177
+ object_unref(OBJECT(server->ioc));
178
+}
179
+
180
+static void panic_cb(VuDev *vu_dev, const char *buf)
181
+{
182
+ VuServer *server = container_of(vu_dev, VuServer, vu_dev);
183
+
184
+ /* avoid while loop in close_client */
185
+ server->processing_msg = false;
186
+
187
+ if (buf) {
188
+ error_report("vu_panic: %s", buf);
189
+ }
190
+
191
+ if (server->sioc) {
192
+ close_client(server);
193
+ }
194
+
195
+ if (server->device_panic_notifier) {
196
+ server->device_panic_notifier(server);
197
+ }
198
+
199
+ /*
200
+ * Set the callback function for network listener so another
201
+ * vhost-user client can connect to this server
202
+ */
203
+ qio_net_listener_set_client_func(server->listener,
204
+ vu_accept,
205
+ server,
206
+ NULL);
207
+}
208
+
209
+static bool coroutine_fn
210
+vu_message_read(VuDev *vu_dev, int conn_fd, VhostUserMsg *vmsg)
211
+{
212
+ struct iovec iov = {
213
+ .iov_base = (char *)vmsg,
214
+ .iov_len = VHOST_USER_HDR_SIZE,
215
+ };
216
+ int rc, read_bytes = 0;
217
+ Error *local_err = NULL;
218
+ /*
219
+ * Store fds/nfds returned from qio_channel_readv_full into
220
+ * temporary variables.
221
+ *
222
+ * VhostUserMsg is a packed structure, gcc will complain about passing
223
+ * pointer to a packed structure member if we pass &VhostUserMsg.fd_num
224
+ * and &VhostUserMsg.fds directly when calling qio_channel_readv_full,
225
+ * thus two temporary variables nfds and fds are used here.
226
+ */
227
+ size_t nfds = 0, nfds_t = 0;
228
+ const size_t max_fds = G_N_ELEMENTS(vmsg->fds);
229
+ int *fds_t = NULL;
230
+ VuServer *server = container_of(vu_dev, VuServer, vu_dev);
231
+ QIOChannel *ioc = server->ioc;
232
+
233
+ if (!ioc) {
234
+ error_report_err(local_err);
235
+ goto fail;
236
+ }
237
+
238
+ assert(qemu_in_coroutine());
239
+ do {
240
+ /*
241
+ * qio_channel_readv_full may have short reads, keeping calling it
242
+ * until getting VHOST_USER_HDR_SIZE or 0 bytes in total
243
+ */
244
+ rc = qio_channel_readv_full(ioc, &iov, 1, &fds_t, &nfds_t, &local_err);
245
+ if (rc < 0) {
246
+ if (rc == QIO_CHANNEL_ERR_BLOCK) {
247
+ qio_channel_yield(ioc, G_IO_IN);
248
+ continue;
249
+ } else {
250
+ error_report_err(local_err);
251
+ return false;
252
+ }
253
+ }
254
+ read_bytes += rc;
255
+ if (nfds_t > 0) {
256
+ if (nfds + nfds_t > max_fds) {
257
+ error_report("A maximum of %zu fds are allowed, "
258
+ "however got %zu fds now",
259
+ max_fds, nfds + nfds_t);
260
+ goto fail;
261
+ }
262
+ memcpy(vmsg->fds + nfds, fds_t,
263
+ nfds_t *sizeof(vmsg->fds[0]));
264
+ nfds += nfds_t;
265
+ g_free(fds_t);
266
+ }
267
+ if (read_bytes == VHOST_USER_HDR_SIZE || rc == 0) {
268
+ break;
269
+ }
270
+ iov.iov_base = (char *)vmsg + read_bytes;
271
+ iov.iov_len = VHOST_USER_HDR_SIZE - read_bytes;
272
+ } while (true);
273
+
274
+ vmsg->fd_num = nfds;
275
+ /* qio_channel_readv_full will make socket fds blocking, unblock them */
276
+ vmsg_unblock_fds(vmsg);
277
+ if (vmsg->size > sizeof(vmsg->payload)) {
278
+ error_report("Error: too big message request: %d, "
279
+ "size: vmsg->size: %u, "
280
+ "while sizeof(vmsg->payload) = %zu",
281
+ vmsg->request, vmsg->size, sizeof(vmsg->payload));
282
+ goto fail;
283
+ }
284
+
285
+ struct iovec iov_payload = {
286
+ .iov_base = (char *)&vmsg->payload,
287
+ .iov_len = vmsg->size,
288
+ };
289
+ if (vmsg->size) {
290
+ rc = qio_channel_readv_all_eof(ioc, &iov_payload, 1, &local_err);
291
+ if (rc == -1) {
292
+ error_report_err(local_err);
293
+ goto fail;
294
+ }
295
+ }
296
+
297
+ return true;
298
+
299
+fail:
300
+ vmsg_close_fds(vmsg);
301
+
302
+ return false;
303
+}
304
+
305
+
306
+static void vu_client_start(VuServer *server);
307
+static coroutine_fn void vu_client_trip(void *opaque)
308
+{
309
+ VuServer *server = opaque;
310
+
311
+ while (!server->aio_context_changed && server->sioc) {
312
+ server->processing_msg = true;
313
+ vu_dispatch(&server->vu_dev);
314
+ server->processing_msg = false;
315
+ }
316
+
317
+ if (server->aio_context_changed && server->sioc) {
318
+ server->aio_context_changed = false;
319
+ vu_client_start(server);
320
+ }
321
+}
322
+
323
+static void vu_client_start(VuServer *server)
324
+{
325
+ server->co_trip = qemu_coroutine_create(vu_client_trip, server);
326
+ aio_co_enter(server->ctx, server->co_trip);
327
+}
328
+
329
+/*
330
+ * a wrapper for vu_kick_cb
331
+ *
332
+ * since aio_dispatch can only pass one user data pointer to the
333
+ * callback function, pack VuDev and pvt into a struct. Then unpack it
334
+ * and pass them to vu_kick_cb
335
+ */
336
+static void kick_handler(void *opaque)
337
+{
338
+ VuFdWatch *vu_fd_watch = opaque;
339
+ vu_fd_watch->processing = true;
340
+ vu_fd_watch->cb(vu_fd_watch->vu_dev, 0, vu_fd_watch->pvt);
341
+ vu_fd_watch->processing = false;
342
+}
343
+
344
+
345
+static VuFdWatch *find_vu_fd_watch(VuServer *server, int fd)
346
+{
347
+
348
+ VuFdWatch *vu_fd_watch, *next;
349
+ QTAILQ_FOREACH_SAFE(vu_fd_watch, &server->vu_fd_watches, next, next) {
350
+ if (vu_fd_watch->fd == fd) {
351
+ return vu_fd_watch;
352
+ }
353
+ }
354
+ return NULL;
355
+}
356
+
357
+static void
358
+set_watch(VuDev *vu_dev, int fd, int vu_evt,
359
+ vu_watch_cb cb, void *pvt)
360
+{
361
+
362
+ VuServer *server = container_of(vu_dev, VuServer, vu_dev);
363
+ g_assert(vu_dev);
364
+ g_assert(fd >= 0);
365
+ g_assert(cb);
366
+
367
+ VuFdWatch *vu_fd_watch = find_vu_fd_watch(server, fd);
368
+
369
+ if (!vu_fd_watch) {
370
+ VuFdWatch *vu_fd_watch = g_new0(VuFdWatch, 1);
371
+
372
+ QTAILQ_INSERT_TAIL(&server->vu_fd_watches, vu_fd_watch, next);
373
+
374
+ vu_fd_watch->fd = fd;
375
+ vu_fd_watch->cb = cb;
376
+ qemu_set_nonblock(fd);
377
+ aio_set_fd_handler(server->ioc->ctx, fd, true, kick_handler,
378
+ NULL, NULL, vu_fd_watch);
379
+ vu_fd_watch->vu_dev = vu_dev;
380
+ vu_fd_watch->pvt = pvt;
381
+ }
382
+}
383
+
384
+
385
+static void remove_watch(VuDev *vu_dev, int fd)
386
+{
387
+ VuServer *server;
388
+ g_assert(vu_dev);
389
+ g_assert(fd >= 0);
390
+
391
+ server = container_of(vu_dev, VuServer, vu_dev);
392
+
393
+ VuFdWatch *vu_fd_watch = find_vu_fd_watch(server, fd);
394
+
395
+ if (!vu_fd_watch) {
396
+ return;
397
+ }
398
+ aio_set_fd_handler(server->ioc->ctx, fd, true, NULL, NULL, NULL, NULL);
399
+
400
+ QTAILQ_REMOVE(&server->vu_fd_watches, vu_fd_watch, next);
401
+ g_free(vu_fd_watch);
402
+}
403
+
404
+
405
+static void vu_accept(QIONetListener *listener, QIOChannelSocket *sioc,
406
+ gpointer opaque)
407
+{
408
+ VuServer *server = opaque;
409
+
410
+ if (server->sioc) {
411
+ warn_report("Only one vhost-user client is allowed to "
412
+ "connect the server one time");
413
+ return;
414
+ }
415
+
416
+ if (!vu_init(&server->vu_dev, server->max_queues, sioc->fd, panic_cb,
417
+ vu_message_read, set_watch, remove_watch, server->vu_iface)) {
418
+ error_report("Failed to initialize libvhost-user");
419
+ return;
420
+ }
421
+
422
+ /*
423
+ * Unset the callback function for the network listener so that another
+ * vhost-user client keeps waiting until this client disconnects
425
+ */
426
+ qio_net_listener_set_client_func(server->listener,
427
+ NULL,
428
+ NULL,
429
+ NULL);
430
+ server->sioc = sioc;
431
+ /*
432
+ * Increase the object reference, so sioc will not be freed by
433
+ * qio_net_listener_channel_func which will call object_unref(OBJECT(sioc))
434
+ */
435
+ object_ref(OBJECT(server->sioc));
436
+ qio_channel_set_name(QIO_CHANNEL(sioc), "vhost-user client");
437
+ server->ioc = QIO_CHANNEL(sioc);
438
+ object_ref(OBJECT(server->ioc));
439
+ qio_channel_attach_aio_context(server->ioc, server->ctx);
440
+ qio_channel_set_blocking(QIO_CHANNEL(server->sioc), false, NULL);
441
+ vu_client_start(server);
442
+}
443
+
444
+
445
+void vhost_user_server_stop(VuServer *server)
446
+{
447
+ if (server->sioc) {
448
+ close_client(server);
449
+ }
450
+
451
+ if (server->listener) {
452
+ qio_net_listener_disconnect(server->listener);
453
+ object_unref(OBJECT(server->listener));
454
+ }
455
+
456
+}
457
+
458
+void vhost_user_server_set_aio_context(VuServer *server, AioContext *ctx)
459
+{
460
+ VuFdWatch *vu_fd_watch, *next;
461
+ void *opaque = NULL;
462
+ IOHandler *io_read = NULL;
463
+ bool attach;
464
+
465
+ server->ctx = ctx ? ctx : qemu_get_aio_context();
466
+
467
+ if (!server->sioc) {
468
+ /* not yet serving any client*/
469
+ return;
470
+ }
471
+
472
+ if (ctx) {
473
+ qio_channel_attach_aio_context(server->ioc, ctx);
474
+ server->aio_context_changed = true;
475
+ io_read = kick_handler;
476
+ attach = true;
477
+ } else {
478
+ qio_channel_detach_aio_context(server->ioc);
479
+ /* server->ioc->ctx keeps the old AioContext */
480
+ ctx = server->ioc->ctx;
481
+ attach = false;
482
+ }
483
+
484
+ QTAILQ_FOREACH_SAFE(vu_fd_watch, &server->vu_fd_watches, next, next) {
485
+ if (vu_fd_watch->cb) {
486
+ opaque = attach ? vu_fd_watch : NULL;
487
+ aio_set_fd_handler(ctx, vu_fd_watch->fd, true,
488
+ io_read, NULL, NULL,
489
+ opaque);
490
+ }
491
+ }
492
+}
493
+
494
+
495
+bool vhost_user_server_start(VuServer *server,
496
+ SocketAddress *socket_addr,
497
+ AioContext *ctx,
498
+ uint16_t max_queues,
499
+ DevicePanicNotifierFn *device_panic_notifier,
500
+ const VuDevIface *vu_iface,
501
+ Error **errp)
502
+{
503
+ QIONetListener *listener = qio_net_listener_new();
504
+ if (qio_net_listener_open_sync(listener, socket_addr, 1,
505
+ errp) < 0) {
506
+ object_unref(OBJECT(listener));
507
+ return false;
508
+ }
509
+
510
+ /* zero out unspecified fields */
511
+ *server = (VuServer) {
512
+ .listener = listener,
513
+ .vu_iface = vu_iface,
514
+ .max_queues = max_queues,
515
+ .ctx = ctx,
516
+ .device_panic_notifier = device_panic_notifier,
517
+ };
518
+
519
+ qio_net_listener_set_name(server->listener, "vhost-user-backend-listener");
520
+
521
+ qio_net_listener_set_client_func(server->listener,
522
+ vu_accept,
523
+ server,
524
+ NULL);
525
+
526
+ QTAILQ_INIT(&server->vu_fd_watches);
527
+ return true;
528
+}
529
diff --git a/util/meson.build b/util/meson.build
530
index XXXXXXX..XXXXXXX 100644
531
--- a/util/meson.build
532
+++ b/util/meson.build
533
@@ -XXX,XX +XXX,XX @@ if have_block
534
util_ss.add(files('main-loop.c'))
535
util_ss.add(files('nvdimm-utils.c'))
536
util_ss.add(files('qemu-coroutine.c', 'qemu-coroutine-lock.c', 'qemu-coroutine-io.c'))
537
+ util_ss.add(when: 'CONFIG_LINUX', if_true: files('vhost-user-server.c'))
538
util_ss.add(files('qemu-coroutine-sleep.c'))
539
util_ss.add(files('qemu-co-shared-resource.c'))
540
util_ss.add(files('thread-pool.c', 'qemu-timer.c'))
541
--
542
2.26.2
543
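
The header-reading loop in `vu_message_read()` above handles short reads by advancing the buffer pointer and retrying until the full header has arrived or the peer closes the connection. The same pattern is sketched below with plain POSIX `read()` on a blocking descriptor. This is an assumption-laden simplification: the patch itself uses `qio_channel_readv_full()` and yields the coroutine on `QIO_CHANNEL_ERR_BLOCK`, and `read_full` is a hypothetical helper name.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read up to len bytes, retrying after short reads.
 * Returns the number of bytes read (less than len only at EOF),
 * or -1 on error with errno set by read().
 */
ssize_t read_full(int fd, void *buf, size_t len)
{
    size_t done = 0;

    assert(buf != NULL || len == 0);
    while (done < len) {
        ssize_t rc = read(fd, (char *)buf + done, len - done);
        if (rc < 0) {
            if (errno == EINTR) {
                continue;   /* interrupted by a signal: retry */
            }
            return -1;
        }
        if (rc == 0) {
            break;          /* EOF: peer closed the connection */
        }
        done += (size_t)rc;
    }
    return (ssize_t)done;
}
```

A caller that needs a complete header treats any return value other than the requested length as a failed or closed connection.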
1
If a qcow2 file is preallocated, it can no longer guarantee that it
1
From: Coiby Xu <coiby.xu@gmail.com>
2
initially appears as filled with zeroes.
3
2
4
So implement .bdrv_has_zero_init() by checking whether the file is
3
Move the constants from hw/core/qdev-properties.c to
5
preallocated; if so, forward the call to the underlying storage node,
4
util/block-helpers.h so that knowledge of the min/max values is
6
except for when it is encrypted: Encrypted preallocated images always
7
return effectively random data, so .bdrv_has_zero_init() must always
8
return 0 for them.
9
5
10
.bdrv_has_zero_init_truncate() can remain bdrv_has_zero_init_1(),
6
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
11
because it presupposes PREALLOC_MODE_OFF.
7
Signed-off-by: Coiby Xu <coiby.xu@gmail.com>
8
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
9
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
10
Acked-by: Eduardo Habkost <ehabkost@redhat.com>
11
Message-id: 20200918080912.321299-5-coiby.xu@gmail.com
12
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
13
---
14
util/block-helpers.h | 19 +++++++++++++
15
hw/core/qdev-properties-system.c | 31 ++++-----------------
16
util/block-helpers.c | 46 ++++++++++++++++++++++++++++++++
17
util/meson.build | 1 +
18
4 files changed, 71 insertions(+), 26 deletions(-)
19
create mode 100644 util/block-helpers.h
20
create mode 100644 util/block-helpers.c
12
21
13
Reported-by: Stefano Garzarella <sgarzare@redhat.com>
22
diff --git a/util/block-helpers.h b/util/block-helpers.h
14
Signed-off-by: Max Reitz <mreitz@redhat.com>
23
new file mode 100644
15
Message-id: 20190724171239.8764-7-mreitz@redhat.com
24
index XXXXXXX..XXXXXXX
16
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
25
--- /dev/null
17
Signed-off-by: Max Reitz <mreitz@redhat.com>
26
+++ b/util/block-helpers.h
18
---
27
@@ -XXX,XX +XXX,XX @@
19
block/qcow2.c | 29 ++++++++++++++++++++++++++++-
28
+#ifndef BLOCK_HELPERS_H
20
1 file changed, 28 insertions(+), 1 deletion(-)
29
+#define BLOCK_HELPERS_H
21
30
+
22
diff --git a/block/qcow2.c b/block/qcow2.c
31
+#include "qemu/units.h"
32
+
33
+/* lower limit is sector size */
34
+#define MIN_BLOCK_SIZE INT64_C(512)
35
+#define MIN_BLOCK_SIZE_STR "512 B"
36
+/*
37
+ * upper limit is arbitrary, 2 MiB looks sufficient for all sensible uses, and
38
+ * matches qcow2 cluster size limit
39
+ */
40
+#define MAX_BLOCK_SIZE (2 * MiB)
41
+#define MAX_BLOCK_SIZE_STR "2 MiB"
42
+
43
+void check_block_size(const char *id, const char *name, int64_t value,
44
+ Error **errp);
45
+
46
+#endif /* BLOCK_HELPERS_H */
47
diff --git a/hw/core/qdev-properties-system.c b/hw/core/qdev-properties-system.c
23
index XXXXXXX..XXXXXXX 100644
48
index XXXXXXX..XXXXXXX 100644
24
--- a/block/qcow2.c
49
--- a/hw/core/qdev-properties-system.c
25
+++ b/block/qcow2.c
50
+++ b/hw/core/qdev-properties-system.c
26
@@ -XXX,XX +XXX,XX @@ static ImageInfoSpecific *qcow2_get_specific_info(BlockDriverState *bs,
51
@@ -XXX,XX +XXX,XX @@
27
return spec_info;
52
#include "sysemu/blockdev.h"
53
#include "net/net.h"
54
#include "hw/pci/pci.h"
55
+#include "util/block-helpers.h"
56
57
static bool check_prop_still_unset(DeviceState *dev, const char *name,
58
const void *old_val, const char *new_val,
59
@@ -XXX,XX +XXX,XX @@ const PropertyInfo qdev_prop_losttickpolicy = {
60
61
/* --- blocksize --- */
62
63
-/* lower limit is sector size */
64
-#define MIN_BLOCK_SIZE 512
65
-#define MIN_BLOCK_SIZE_STR "512 B"
66
-/*
67
- * upper limit is arbitrary, 2 MiB looks sufficient for all sensible uses, and
68
- * matches qcow2 cluster size limit
69
- */
70
-#define MAX_BLOCK_SIZE (2 * MiB)
71
-#define MAX_BLOCK_SIZE_STR "2 MiB"
72
-
73
static void set_blocksize(Object *obj, Visitor *v, const char *name,
74
void *opaque, Error **errp)
75
{
76
@@ -XXX,XX +XXX,XX @@ static void set_blocksize(Object *obj, Visitor *v, const char *name,
77
Property *prop = opaque;
78
uint32_t *ptr = qdev_get_prop_ptr(dev, prop);
79
uint64_t value;
80
+ Error *local_err = NULL;
81
82
if (dev->realized) {
83
qdev_prop_set_after_realize(dev, name, errp);
84
@@ -XXX,XX +XXX,XX @@ static void set_blocksize(Object *obj, Visitor *v, const char *name,
85
if (!visit_type_size(v, name, &value, errp)) {
86
return;
87
}
88
- /* value of 0 means "unset" */
89
- if (value && (value < MIN_BLOCK_SIZE || value > MAX_BLOCK_SIZE)) {
90
- error_setg(errp,
91
- "Property %s.%s doesn't take value %" PRIu64
92
- " (minimum: " MIN_BLOCK_SIZE_STR
93
- ", maximum: " MAX_BLOCK_SIZE_STR ")",
94
- dev->id ? : "", name, value);
95
+ check_block_size(dev->id ? : "", name, value, &local_err);
96
+ if (local_err) {
97
+ error_propagate(errp, local_err);
98
return;
99
}
100
-
101
- /* We rely on power-of-2 blocksizes for bitmasks */
102
- if ((value & (value - 1)) != 0) {
103
- error_setg(errp,
104
- "Property %s.%s doesn't take value '%" PRId64 "', "
105
- "it's not a power of 2", dev->id ?: "", name, (int64_t)value);
106
- return;
107
- }
108
-
109
*ptr = value;
28
}
110
}
29
111
30
+static int qcow2_has_zero_init(BlockDriverState *bs)
112
diff --git a/util/block-helpers.c b/util/block-helpers.c
113
new file mode 100644
114
index XXXXXXX..XXXXXXX
115
--- /dev/null
116
+++ b/util/block-helpers.c
117
@@ -XXX,XX +XXX,XX @@
118
+/*
119
+ * Block utility functions
120
+ *
121
+ * Copyright IBM, Corp. 2011
122
+ * Copyright (c) 2020 Coiby Xu <coiby.xu@gmail.com>
123
+ *
124
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
125
+ * See the COPYING file in the top-level directory.
126
+ */
127
+
128
+#include "qemu/osdep.h"
129
+#include "qapi/error.h"
130
+#include "qapi/qmp/qerror.h"
131
+#include "block-helpers.h"
132
+
133
+/**
134
+ * check_block_size:
135
+ * @id: The unique ID of the object
136
+ * @name: The name of the property being validated
137
+ * @value: The block size in bytes
138
+ * @errp: A pointer to an area to store an error
139
+ *
140
+ * This function checks that the block size meets the following conditions:
141
+ * 1. At least MIN_BLOCK_SIZE
142
+ * 2. No larger than MAX_BLOCK_SIZE
143
+ * 3. A power of 2
144
+ */
145
+void check_block_size(const char *id, const char *name, int64_t value,
146
+ Error **errp)
31
+{
147
+{
32
+ BDRVQcow2State *s = bs->opaque;
148
+ /* value of 0 means "unset" */
33
+ bool preallocated;
149
+ if (value && (value < MIN_BLOCK_SIZE || value > MAX_BLOCK_SIZE)) {
34
+
150
+ error_setg(errp, QERR_PROPERTY_VALUE_OUT_OF_RANGE,
35
+ if (qemu_in_coroutine()) {
151
+ id, name, value, MIN_BLOCK_SIZE, MAX_BLOCK_SIZE);
36
+ qemu_co_mutex_lock(&s->lock);
152
+ return;
37
+ }
38
+ /*
39
+ * Check preallocation status: Preallocated images have all L2
40
+ * tables allocated, nonpreallocated images have none. It is
41
+ * therefore enough to check the first one.
42
+ */
43
+ preallocated = s->l1_size > 0 && s->l1_table[0] != 0;
44
+ if (qemu_in_coroutine()) {
45
+ qemu_co_mutex_unlock(&s->lock);
46
+ }
153
+ }
47
+
154
+
48
+ if (!preallocated) {
155
+ /* We rely on power-of-2 blocksizes for bitmasks */
49
+ return 1;
156
+ if ((value & (value - 1)) != 0) {
50
+ } else if (bs->encrypted) {
157
+ error_setg(errp,
51
+ return 0;
158
+ "Property %s.%s doesn't take value '%" PRId64
52
+ } else {
159
+ "', it's not a power of 2",
53
+ return bdrv_has_zero_init(s->data_file->bs);
160
+ id, name, value);
161
+ return;
54
+ }
162
+ }
55
+}
163
+}
56
+
164
diff --git a/util/meson.build b/util/meson.build
57
static int qcow2_save_vmstate(BlockDriverState *bs, QEMUIOVector *qiov,
165
index XXXXXXX..XXXXXXX 100644
58
int64_t pos)
166
--- a/util/meson.build
59
{
167
+++ b/util/meson.build
60
@@ -XXX,XX +XXX,XX @@ BlockDriver bdrv_qcow2 = {
168
@@ -XXX,XX +XXX,XX @@ if have_block
61
.bdrv_child_perm = bdrv_format_default_perms,
169
util_ss.add(files('nvdimm-utils.c'))
62
.bdrv_co_create_opts = qcow2_co_create_opts,
170
util_ss.add(files('qemu-coroutine.c', 'qemu-coroutine-lock.c', 'qemu-coroutine-io.c'))
63
.bdrv_co_create = qcow2_co_create,
171
util_ss.add(when: 'CONFIG_LINUX', if_true: files('vhost-user-server.c'))
64
- .bdrv_has_zero_init = bdrv_has_zero_init_1,
172
+ util_ss.add(files('block-helpers.c'))
65
+ .bdrv_has_zero_init = qcow2_has_zero_init,
173
util_ss.add(files('qemu-coroutine-sleep.c'))
66
.bdrv_has_zero_init_truncate = bdrv_has_zero_init_1,
174
util_ss.add(files('qemu-co-shared-resource.c'))
67
.bdrv_co_block_status = qcow2_co_block_status,
175
util_ss.add(files('thread-pool.c', 'qemu-timer.c'))
68
69
--
176
--
70
2.21.0
177
2.26.2
71
178
72
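
The three conditions documented for `check_block_size()` above (at least `MIN_BLOCK_SIZE`, no larger than `MAX_BLOCK_SIZE`, a power of 2, with 0 meaning "unset") can be exercised outside QEMU. The sketch below is a standalone approximation: it returns a boolean instead of filling in an `Error **`, spells out 2 MiB numerically instead of using `qemu/units.h`, and `block_size_is_valid` is a hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same limits as the patch: 512 B minimum, 2 MiB maximum. */
#define MIN_BLOCK_SIZE INT64_C(512)
#define MAX_BLOCK_SIZE (INT64_C(2) * 1024 * 1024)

static_assert(MIN_BLOCK_SIZE <= MAX_BLOCK_SIZE, "limits must be ordered");

/* Returns true when value is 0 ("unset") or a power of 2 within the limits. */
bool block_size_is_valid(int64_t value)
{
    if (value == 0) {
        return true;
    }
    if (value < MIN_BLOCK_SIZE || value > MAX_BLOCK_SIZE) {
        return false;
    }
    /* A power of 2 has exactly one bit set, so clearing it leaves zero. */
    return (value & (value - 1)) == 0;
}
```

For example, 4096 is accepted, while 1000 (in range but not a power of 2) and 4 MiB (a power of 2 but out of range) are rejected.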
1
From: Coiby Xu <coiby.xu@gmail.com>
1
2
3
By making use of libvhost-user, a block device drive can be shared with
the connected vhost-user client. Only one client can connect to the
server at a time.

Since vhost-user-server needs a block drive to be created first, delay
the creation of this object.
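
Discard and write-zeroes requests in this patch carry a starting sector and a sector count, which the server shifts left by 9 to obtain byte values for the block layer. A minimal sketch of that conversion follows, assuming 512-byte sectors as that shift implies. `struct byte_range` and `sectors_to_bytes` are hypothetical names, and unlike the patch the sketch widens the 32-bit count to 64 bits before shifting so that large counts do not wrap.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_BITS 9   /* 512-byte sectors, matching the "<< 9" in the patch */

/* Hypothetical byte range: start offset and length, both in bytes. */
struct byte_range {
    uint64_t offset;
    uint64_t length;
};

/* Convert a sector-based request into a byte range. */
struct byte_range sectors_to_bytes(uint64_t sector, uint32_t num_sectors)
{
    /* Guard against the offset overflowing 64 bits. */
    assert(sector <= (UINT64_MAX >> SECTOR_BITS));

    struct byte_range r = {
        .offset = sector << SECTOR_BITS,
        /* Widen first: a 32-bit shift would wrap at 4 GiB. */
        .length = (uint64_t)num_sectors << SECTOR_BITS,
    };
    return r;
}
```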
9
10
Suggested-by: Kevin Wolf <kwolf@redhat.com>
11
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
12
Signed-off-by: Coiby Xu <coiby.xu@gmail.com>
13
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
14
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
15
Message-id: 20200918080912.321299-6-coiby.xu@gmail.com
16
[Shorten "vhost_user_blk_server" string to "vhost_user_blk" to avoid the
17
following compiler warning:
18
../block/export/vhost-user-blk-server.c:178:50: error: ‘%s’ directive output truncated writing 21 bytes into a region of size 20 [-Werror=format-truncation=]
19
and fix "Invalid size %ld ..." ssize_t format string arguments for
20
32-bit hosts.
21
--Stefan]
22
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
23
---
24
block/export/vhost-user-blk-server.h | 36 ++
25
block/export/vhost-user-blk-server.c | 661 +++++++++++++++++++++++++++
26
softmmu/vl.c | 4 +
27
block/meson.build | 1 +
28
4 files changed, 702 insertions(+)
29
create mode 100644 block/export/vhost-user-blk-server.h
30
create mode 100644 block/export/vhost-user-blk-server.c
31
32
diff --git a/block/export/vhost-user-blk-server.h b/block/export/vhost-user-blk-server.h
33
new file mode 100644
34
index XXXXXXX..XXXXXXX
35
--- /dev/null
36
+++ b/block/export/vhost-user-blk-server.h
37
@@ -XXX,XX +XXX,XX @@
38
+/*
39
+ * Sharing QEMU block devices via vhost-user protocol
40
+ *
41
+ * Copyright (c) Coiby Xu <coiby.xu@gmail.com>.
42
+ * Copyright (c) 2020 Red Hat, Inc.
43
+ *
44
+ * This work is licensed under the terms of the GNU GPL, version 2 or
45
+ * later. See the COPYING file in the top-level directory.
46
+ */
47
+
48
+#ifndef VHOST_USER_BLK_SERVER_H
49
+#define VHOST_USER_BLK_SERVER_H
50
+#include "util/vhost-user-server.h"
51
+
52
+typedef struct VuBlockDev VuBlockDev;
53
+#define TYPE_VHOST_USER_BLK_SERVER "vhost-user-blk-server"
54
+#define VHOST_USER_BLK_SERVER(obj) \
55
+ OBJECT_CHECK(VuBlockDev, obj, TYPE_VHOST_USER_BLK_SERVER)
56
+
57
+/* vhost user block device */
58
+struct VuBlockDev {
59
+ Object parent_obj;
60
+ char *node_name;
61
+ SocketAddress *addr;
62
+ AioContext *ctx;
63
+ VuServer vu_server;
64
+ bool running;
65
+ uint32_t blk_size;
66
+ BlockBackend *backend;
67
+ QIOChannelSocket *sioc;
68
+ QTAILQ_ENTRY(VuBlockDev) next;
69
+ struct virtio_blk_config blkcfg;
70
+ bool writable;
71
+};
72
+
73
+#endif /* VHOST_USER_BLK_SERVER_H */
74
diff --git a/block/export/vhost-user-blk-server.c b/block/export/vhost-user-blk-server.c
75
new file mode 100644
76
index XXXXXXX..XXXXXXX
77
--- /dev/null
78
+++ b/block/export/vhost-user-blk-server.c
79
@@ -XXX,XX +XXX,XX @@
80
+/*
81
+ * Sharing QEMU block devices via vhost-user protocol
82
+ *
83
+ * Parts of the code based on nbd/server.c.
84
+ *
85
+ * Copyright (c) Coiby Xu <coiby.xu@gmail.com>.
86
+ * Copyright (c) 2020 Red Hat, Inc.
87
+ *
88
+ * This work is licensed under the terms of the GNU GPL, version 2 or
89
+ * later. See the COPYING file in the top-level directory.
90
+ */
91
+#include "qemu/osdep.h"
92
+#include "block/block.h"
93
+#include "vhost-user-blk-server.h"
94
+#include "qapi/error.h"
95
+#include "qom/object_interfaces.h"
96
+#include "sysemu/block-backend.h"
97
+#include "util/block-helpers.h"
98
+
99
+enum {
100
+ VHOST_USER_BLK_MAX_QUEUES = 1,
101
+};
102
+struct virtio_blk_inhdr {
103
+ unsigned char status;
104
+};
105
+
106
+typedef struct VuBlockReq {
107
+ VuVirtqElement *elem;
108
+ int64_t sector_num;
109
+ size_t size;
110
+ struct virtio_blk_inhdr *in;
111
+ struct virtio_blk_outhdr out;
112
+ VuServer *server;
113
+ struct VuVirtq *vq;
114
+} VuBlockReq;
115
+
116
+static void vu_block_req_complete(VuBlockReq *req)
117
+{
118
+ VuDev *vu_dev = &req->server->vu_dev;
119
+
120
+ /* IO size with 1 extra status byte */
121
+ vu_queue_push(vu_dev, req->vq, req->elem, req->size + 1);
122
+ vu_queue_notify(vu_dev, req->vq);
123
+
124
+ if (req->elem) {
125
+ free(req->elem);
126
+ }
127
+
128
+ g_free(req);
129
+}
130
+
131
+static VuBlockDev *get_vu_block_device_by_server(VuServer *server)
132
+{
133
+ return container_of(server, VuBlockDev, vu_server);
134
+}
135
+
136
+static int coroutine_fn
137
+vu_block_discard_write_zeroes(VuBlockReq *req, struct iovec *iov,
138
+ uint32_t iovcnt, uint32_t type)
139
+{
140
+ struct virtio_blk_discard_write_zeroes desc;
141
+ ssize_t size = iov_to_buf(iov, iovcnt, 0, &desc, sizeof(desc));
142
+ if (unlikely(size != sizeof(desc))) {
143
+ error_report("Invalid size %zd, expect %zu", size, sizeof(desc));
144
+ return -EINVAL;
145
+ }
146
+
147
+ VuBlockDev *vdev_blk = get_vu_block_device_by_server(req->server);
148
+ uint64_t range[2] = { le64_to_cpu(desc.sector) << 9,
149
+ le32_to_cpu(desc.num_sectors) << 9 };
150
+ if (type == VIRTIO_BLK_T_DISCARD) {
151
+ if (blk_co_pdiscard(vdev_blk->backend, range[0], range[1]) == 0) {
152
+ return 0;
153
+ }
154
+ } else if (type == VIRTIO_BLK_T_WRITE_ZEROES) {
155
+ if (blk_co_pwrite_zeroes(vdev_blk->backend,
156
+ range[0], range[1], 0) == 0) {
157
+ return 0;
158
+ }
159
+ }
160
+
161
+ return -EINVAL;
162
+}
163
+
164
+static void coroutine_fn vu_block_flush(VuBlockReq *req)
165
+{
166
+ VuBlockDev *vdev_blk = get_vu_block_device_by_server(req->server);
167
+ BlockBackend *backend = vdev_blk->backend;
168
+ blk_co_flush(backend);
169
+}
170
+
171
+struct req_data {
172
+ VuServer *server;
173
+ VuVirtq *vq;
174
+ VuVirtqElement *elem;
175
+};
176
+
177
+static void coroutine_fn vu_block_virtio_process_req(void *opaque)
178
+{
179
+ struct req_data *data = opaque;
180
+ VuServer *server = data->server;
181
+ VuVirtq *vq = data->vq;
182
+ VuVirtqElement *elem = data->elem;
183
+ uint32_t type;
184
+ VuBlockReq *req;
185
+
186
+ VuBlockDev *vdev_blk = get_vu_block_device_by_server(server);
187
+ BlockBackend *backend = vdev_blk->backend;
188
+
189
+ struct iovec *in_iov = elem->in_sg;
190
+ struct iovec *out_iov = elem->out_sg;
191
+ unsigned in_num = elem->in_num;
192
+ unsigned out_num = elem->out_num;
193
+ /* refer to hw/block/virtio_blk.c */
194
+ if (elem->out_num < 1 || elem->in_num < 1) {
195
+ error_report("virtio-blk request missing headers");
196
+ free(elem);
197
+ return;
198
+ }
199
+
200
+ req = g_new0(VuBlockReq, 1);
201
+ req->server = server;
202
+ req->vq = vq;
203
+ req->elem = elem;
204
+
205
+ if (unlikely(iov_to_buf(out_iov, out_num, 0, &req->out,
206
+ sizeof(req->out)) != sizeof(req->out))) {
207
+ error_report("virtio-blk request outhdr too short");
208
+ goto err;
209
+ }
210
+
211
+ iov_discard_front(&out_iov, &out_num, sizeof(req->out));
212
+
213
+ if (in_iov[in_num - 1].iov_len < sizeof(struct virtio_blk_inhdr)) {
214
+ error_report("virtio-blk request inhdr too short");
215
+ goto err;
216
+ }
217
+
218
+ /* We always touch the last byte, so just see how big in_iov is. */
219
+ req->in = (void *)in_iov[in_num - 1].iov_base
220
+ + in_iov[in_num - 1].iov_len
221
+ - sizeof(struct virtio_blk_inhdr);
222
+ iov_discard_back(in_iov, &in_num, sizeof(struct virtio_blk_inhdr));
223
+
224
+ type = le32_to_cpu(req->out.type);
225
+ switch (type & ~VIRTIO_BLK_T_BARRIER) {
226
+ case VIRTIO_BLK_T_IN:
227
+ case VIRTIO_BLK_T_OUT: {
228
+ ssize_t ret = 0;
229
+ bool is_write = type & VIRTIO_BLK_T_OUT;
230
+ req->sector_num = le64_to_cpu(req->out.sector);
231
+
232
+ int64_t offset = req->sector_num * vdev_blk->blk_size;
233
+ QEMUIOVector qiov;
234
+ if (is_write) {
235
+ qemu_iovec_init_external(&qiov, out_iov, out_num);
236
+ ret = blk_co_pwritev(backend, offset, qiov.size,
237
+ &qiov, 0);
238
+ } else {
239
+ qemu_iovec_init_external(&qiov, in_iov, in_num);
240
+ ret = blk_co_preadv(backend, offset, qiov.size,
241
+ &qiov, 0);
242
+ }
243
+ if (ret >= 0) {
244
+ req->in->status = VIRTIO_BLK_S_OK;
245
+ } else {
246
+ req->in->status = VIRTIO_BLK_S_IOERR;
247
+ }
248
+ break;
249
+ }
250
+ case VIRTIO_BLK_T_FLUSH:
251
+ vu_block_flush(req);
252
+ req->in->status = VIRTIO_BLK_S_OK;
253
+ break;
254
+ case VIRTIO_BLK_T_GET_ID: {
255
+ size_t size = MIN(iov_size(&elem->in_sg[0], in_num),
256
+ VIRTIO_BLK_ID_BYTES);
257
+ snprintf(elem->in_sg[0].iov_base, size, "%s", "vhost_user_blk");
258
+ req->in->status = VIRTIO_BLK_S_OK;
259
+ req->size = elem->in_sg[0].iov_len;
260
+ break;
261
+ }
262
+ case VIRTIO_BLK_T_DISCARD:
263
+ case VIRTIO_BLK_T_WRITE_ZEROES: {
264
+ int rc;
265
+ rc = vu_block_discard_write_zeroes(req, &elem->out_sg[1],
266
+ out_num, type);
267
+ if (rc == 0) {
268
+ req->in->status = VIRTIO_BLK_S_OK;
269
+ } else {
270
+ req->in->status = VIRTIO_BLK_S_IOERR;
271
+ }
272
+ break;
273
+ }
274
+ default:
275
+ req->in->status = VIRTIO_BLK_S_UNSUPP;
276
+ break;
277
+ }
278
+
279
+ vu_block_req_complete(req);
280
+ return;
281
+
282
+err:
283
+ free(elem);
284
+ g_free(req);
285
+ return;
286
+}
287
+
288
+static void vu_block_process_vq(VuDev *vu_dev, int idx)
289
+{
290
+ VuServer *server;
291
+ VuVirtq *vq;
292
+ struct req_data *req_data;
293
+
294
+ server = container_of(vu_dev, VuServer, vu_dev);
295
+ assert(server);
296
+
297
+ vq = vu_get_queue(vu_dev, idx);
298
+ assert(vq);
299
+ VuVirtqElement *elem;
300
+ while (1) {
301
+ elem = vu_queue_pop(vu_dev, vq, sizeof(VuVirtqElement) +
302
+ sizeof(VuBlockReq));
303
+ if (elem) {
304
+ req_data = g_new0(struct req_data, 1);
305
+ req_data->server = server;
306
+ req_data->vq = vq;
307
+ req_data->elem = elem;
308
+ Coroutine *co = qemu_coroutine_create(vu_block_virtio_process_req,
309
+ req_data);
310
+ aio_co_enter(server->ioc->ctx, co);
311
+ } else {
312
+ break;
313
+ }
314
+ }
315
+}
316
+
317
+static void vu_block_queue_set_started(VuDev *vu_dev, int idx, bool started)
318
+{
319
+ VuVirtq *vq;
320
+
321
+ assert(vu_dev);
322
+
323
+ vq = vu_get_queue(vu_dev, idx);
324
+ vu_set_queue_handler(vu_dev, vq, started ? vu_block_process_vq : NULL);
325
+}
326
+
327
+static uint64_t vu_block_get_features(VuDev *dev)
328
+{
329
+ uint64_t features;
330
+ VuServer *server = container_of(dev, VuServer, vu_dev);
331
+ VuBlockDev *vdev_blk = get_vu_block_device_by_server(server);
332
+ features = 1ull << VIRTIO_BLK_F_SIZE_MAX |
333
+ 1ull << VIRTIO_BLK_F_SEG_MAX |
334
+ 1ull << VIRTIO_BLK_F_TOPOLOGY |
335
+ 1ull << VIRTIO_BLK_F_BLK_SIZE |
336
+ 1ull << VIRTIO_BLK_F_FLUSH |
337
+ 1ull << VIRTIO_BLK_F_DISCARD |
338
+ 1ull << VIRTIO_BLK_F_WRITE_ZEROES |
339
+ 1ull << VIRTIO_BLK_F_CONFIG_WCE |
340
+ 1ull << VIRTIO_F_VERSION_1 |
341
+ 1ull << VIRTIO_RING_F_INDIRECT_DESC |
342
+ 1ull << VIRTIO_RING_F_EVENT_IDX |
343
+ 1ull << VHOST_USER_F_PROTOCOL_FEATURES;
344
+
345
+ if (!vdev_blk->writable) {
346
+ features |= 1ull << VIRTIO_BLK_F_RO;
347
+ }
348
+
349
+ return features;
350
+}
351
+
352
+static uint64_t vu_block_get_protocol_features(VuDev *dev)
353
+{
354
+ return 1ull << VHOST_USER_PROTOCOL_F_CONFIG |
355
+ 1ull << VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD;
356
+}
357
+
358
+static int
359
+vu_block_get_config(VuDev *vu_dev, uint8_t *config, uint32_t len)
360
+{
361
+ VuServer *server = container_of(vu_dev, VuServer, vu_dev);
362
+ VuBlockDev *vdev_blk = get_vu_block_device_by_server(server);
363
+ memcpy(config, &vdev_blk->blkcfg, len);
364
+
365
+ return 0;
366
+}
367
+
368
+static int
369
+vu_block_set_config(VuDev *vu_dev, const uint8_t *data,
370
+ uint32_t offset, uint32_t size, uint32_t flags)
371
+{
372
+ VuServer *server = container_of(vu_dev, VuServer, vu_dev);
373
+ VuBlockDev *vdev_blk = get_vu_block_device_by_server(server);
374
+ uint8_t wce;
375
+
376
+ /* don't support live migration */
377
+ if (flags != VHOST_SET_CONFIG_TYPE_MASTER) {
378
+ return -EINVAL;
379
+ }
380
+
381
+ if (offset != offsetof(struct virtio_blk_config, wce) ||
382
+ size != 1) {
383
+ return -EINVAL;
384
+ }
385
+
386
+ wce = *data;
387
+ vdev_blk->blkcfg.wce = wce;
388
+ blk_set_enable_write_cache(vdev_blk->backend, wce);
389
+ return 0;
390
+}
391
+
392
+/*
393
+ * When the client disconnects, it sends a VHOST_USER_NONE request
394
+ * and vu_process_message will simply call exit, which causes the VM
395
+ * to exit abruptly.
396
+ * To avoid this issue, process VHOST_USER_NONE request ahead
397
+ * of vu_process_message.
398
+ *
399
+ */
400
+static int vu_block_process_msg(VuDev *dev, VhostUserMsg *vmsg, int *do_reply)
401
+{
402
+ if (vmsg->request == VHOST_USER_NONE) {
403
+ dev->panic(dev, "disconnect");
404
+ return true;
405
+ }
406
+ return false;
407
+}
408
+
409
+static const VuDevIface vu_block_iface = {
410
+ .get_features = vu_block_get_features,
411
+ .queue_set_started = vu_block_queue_set_started,
412
+ .get_protocol_features = vu_block_get_protocol_features,
413
+ .get_config = vu_block_get_config,
414
+ .set_config = vu_block_set_config,
415
+ .process_msg = vu_block_process_msg,
416
+};
417
+
418
+static void blk_aio_attached(AioContext *ctx, void *opaque)
419
+{
420
+ VuBlockDev *vub_dev = opaque;
421
+ aio_context_acquire(ctx);
422
+ vhost_user_server_set_aio_context(&vub_dev->vu_server, ctx);
423
+ aio_context_release(ctx);
424
+}
425
+
426
+static void blk_aio_detach(void *opaque)
427
+{
428
+ VuBlockDev *vub_dev = opaque;
429
+ AioContext *ctx = vub_dev->vu_server.ctx;
430
+ aio_context_acquire(ctx);
431
+ vhost_user_server_set_aio_context(&vub_dev->vu_server, NULL);
432
+ aio_context_release(ctx);
433
+}
434
+
435
+static void
436
+vu_block_initialize_config(BlockDriverState *bs,
437
+ struct virtio_blk_config *config, uint32_t blk_size)
438
+{
439
+ config->capacity = bdrv_getlength(bs) >> BDRV_SECTOR_BITS;
440
+ config->blk_size = blk_size;
441
+ config->size_max = 0;
442
+ config->seg_max = 128 - 2;
443
+ config->min_io_size = 1;
444
+ config->opt_io_size = 1;
445
+ config->num_queues = VHOST_USER_BLK_MAX_QUEUES;
446
+ config->max_discard_sectors = 32768;
447
+ config->max_discard_seg = 1;
448
+ config->discard_sector_alignment = config->blk_size >> 9;
449
+ config->max_write_zeroes_sectors = 32768;
450
+ config->max_write_zeroes_seg = 1;
451
+}
452
+
453
+static VuBlockDev *vu_block_init(VuBlockDev *vu_block_device, Error **errp)
454
+{
455
+
456
+ BlockBackend *blk;
457
+ Error *local_error = NULL;
458
+ const char *node_name = vu_block_device->node_name;
459
+ bool writable = vu_block_device->writable;
460
+ uint64_t perm = BLK_PERM_CONSISTENT_READ;
461
+ int ret;
462
+
463
+ AioContext *ctx;
464
+
465
+ BlockDriverState *bs = bdrv_lookup_bs(node_name, node_name, &local_error);
466
+
467
+ if (!bs) {
468
+ error_propagate(errp, local_error);
469
+ return NULL;
470
+ }
471
+
472
+ if (bdrv_is_read_only(bs)) {
473
+ writable = false;
474
+ }
475
+
476
+ if (writable) {
477
+ perm |= BLK_PERM_WRITE;
478
+ }
479
+
480
+ ctx = bdrv_get_aio_context(bs);
481
+ aio_context_acquire(ctx);
482
+ bdrv_invalidate_cache(bs, NULL);
483
+ aio_context_release(ctx);
484
+
485
+ /*
486
+ * Don't allow resize while the vhost user server is running,
487
+ * otherwise we don't care what happens with the node.
488
+ */
489
+ blk = blk_new(bdrv_get_aio_context(bs), perm,
490
+ BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE_UNCHANGED |
491
+ BLK_PERM_WRITE | BLK_PERM_GRAPH_MOD);
492
+ ret = blk_insert_bs(blk, bs, errp);
493
+
494
+ if (ret < 0) {
495
+ goto fail;
496
+ }
497
+
498
+ blk_set_enable_write_cache(blk, false);
499
+
500
+ blk_set_allow_aio_context_change(blk, true);
501
+
502
+ vu_block_device->blkcfg.wce = 0;
503
+ vu_block_device->backend = blk;
504
+ if (!vu_block_device->blk_size) {
505
+ vu_block_device->blk_size = BDRV_SECTOR_SIZE;
506
+ }
507
+ vu_block_device->blkcfg.blk_size = vu_block_device->blk_size;
508
+ blk_set_guest_block_size(blk, vu_block_device->blk_size);
509
+ vu_block_initialize_config(bs, &vu_block_device->blkcfg,
510
+ vu_block_device->blk_size);
511
+ return vu_block_device;
512
+
513
+fail:
514
+ blk_unref(blk);
515
+ return NULL;
516
+}
517
+
518
+static void vu_block_deinit(VuBlockDev *vu_block_device)
519
+{
520
+ if (vu_block_device->backend) {
521
+ blk_remove_aio_context_notifier(vu_block_device->backend, blk_aio_attached,
522
+ blk_aio_detach, vu_block_device);
523
+ }
524
+
525
+ blk_unref(vu_block_device->backend);
526
+}
527
+
528
+static void vhost_user_blk_server_stop(VuBlockDev *vu_block_device)
529
+{
530
+ vhost_user_server_stop(&vu_block_device->vu_server);
531
+ vu_block_deinit(vu_block_device);
532
+}
533
+
534
+static void vhost_user_blk_server_start(VuBlockDev *vu_block_device,
535
+ Error **errp)
536
+{
537
+ AioContext *ctx;
538
+ SocketAddress *addr = vu_block_device->addr;
539
+
540
+ if (!vu_block_init(vu_block_device, errp)) {
541
+ return;
542
+ }
543
+
544
+ ctx = bdrv_get_aio_context(blk_bs(vu_block_device->backend));
545
+
546
+ if (!vhost_user_server_start(&vu_block_device->vu_server, addr, ctx,
547
+ VHOST_USER_BLK_MAX_QUEUES,
548
+ NULL, &vu_block_iface,
549
+ errp)) {
550
+ goto error;
551
+ }
552
+
553
+ blk_add_aio_context_notifier(vu_block_device->backend, blk_aio_attached,
554
+ blk_aio_detach, vu_block_device);
555
+ vu_block_device->running = true;
556
+ return;
557
+
558
+ error:
559
+ vu_block_deinit(vu_block_device);
560
+}
561
+
562
+static bool vu_prop_modifiable(VuBlockDev *vus, Error **errp)
563
+{
564
+ if (vus->running) {
565
+ error_setg(errp, "The property can't be modified "
566
+ "while the server is running");
567
+ return false;
568
+ }
569
+ return true;
570
+}
571
+
572
+static void vu_set_node_name(Object *obj, const char *value, Error **errp)
573
+{
574
+ VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
575
+
576
+ if (!vu_prop_modifiable(vus, errp)) {
577
+ return;
578
+ }
579
+
580
+ if (vus->node_name) {
581
+ g_free(vus->node_name);
582
+ }
583
+
584
+ vus->node_name = g_strdup(value);
585
+}
586
+
587
+static char *vu_get_node_name(Object *obj, Error **errp)
588
+{
589
+ VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
590
+ return g_strdup(vus->node_name);
591
+}
592
+
593
+static void free_socket_addr(SocketAddress *addr)
594
+{
595
+ g_free(addr->u.q_unix.path);
596
+ g_free(addr);
597
+}
598
+
599
+static void vu_set_unix_socket(Object *obj, const char *value,
600
+ Error **errp)
601
+{
602
+ VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
603
+
604
+ if (!vu_prop_modifiable(vus, errp)) {
605
+ return;
606
+ }
607
+
608
+ if (vus->addr) {
609
+ free_socket_addr(vus->addr);
610
+ }
611
+
612
+ SocketAddress *addr = g_new0(SocketAddress, 1);
613
+ addr->type = SOCKET_ADDRESS_TYPE_UNIX;
614
+ addr->u.q_unix.path = g_strdup(value);
615
+ vus->addr = addr;
616
+}
617
+
618
+static char *vu_get_unix_socket(Object *obj, Error **errp)
619
+{
620
+ VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
621
+ return g_strdup(vus->addr->u.q_unix.path);
622
+}
623
+
624
+static bool vu_get_block_writable(Object *obj, Error **errp)
625
+{
626
+ VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
627
+ return vus->writable;
628
+}
629
+
630
+static void vu_set_block_writable(Object *obj, bool value, Error **errp)
631
+{
632
+ VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
633
+
634
+ if (!vu_prop_modifiable(vus, errp)) {
635
+ return;
636
+ }
637
+
638
+ vus->writable = value;
639
+}
640
+
641
+static void vu_get_blk_size(Object *obj, Visitor *v, const char *name,
642
+ void *opaque, Error **errp)
643
+{
644
+ VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
645
+ uint32_t value = vus->blk_size;
646
+
647
+ visit_type_uint32(v, name, &value, errp);
648
+}
649
+
650
+static void vu_set_blk_size(Object *obj, Visitor *v, const char *name,
651
+ void *opaque, Error **errp)
652
+{
653
+ VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
654
+
655
+ Error *local_err = NULL;
656
+ uint32_t value;
657
+
658
+ if (!vu_prop_modifiable(vus, errp)) {
659
+ return;
660
+ }
661
+
662
+ visit_type_uint32(v, name, &value, &local_err);
663
+ if (local_err) {
664
+ goto out;
665
+ }
666
+
667
+ check_block_size(object_get_typename(obj), name, value, &local_err);
668
+ if (local_err) {
669
+ goto out;
670
+ }
671
+
672
+ vus->blk_size = value;
673
+
674
+out:
675
+ error_propagate(errp, local_err);
676
+}
677
+
678
+static void vhost_user_blk_server_instance_finalize(Object *obj)
679
+{
680
+ VuBlockDev *vub = VHOST_USER_BLK_SERVER(obj);
681
+
682
+ vhost_user_blk_server_stop(vub);
683
+
684
+ /*
685
+ * Unlike object_property_add_str, object_class_property_add_str
686
+ * doesn't have a release method. Thus manual memory freeing is
687
+ * needed.
688
+ */
689
+ free_socket_addr(vub->addr);
690
+ g_free(vub->node_name);
691
+}
692
+
693
+static void vhost_user_blk_server_complete(UserCreatable *obj, Error **errp)
694
+{
695
+ VuBlockDev *vub = VHOST_USER_BLK_SERVER(obj);
696
+
697
+ vhost_user_blk_server_start(vub, errp);
698
+}
699
+
700
+static void vhost_user_blk_server_class_init(ObjectClass *klass,
701
+ void *class_data)
702
+{
703
+ UserCreatableClass *ucc = USER_CREATABLE_CLASS(klass);
704
+ ucc->complete = vhost_user_blk_server_complete;
705
+
706
+ object_class_property_add_bool(klass, "writable",
707
+ vu_get_block_writable,
708
+ vu_set_block_writable);
709
+
710
+ object_class_property_add_str(klass, "node-name",
711
+ vu_get_node_name,
712
+ vu_set_node_name);
713
+
714
+ object_class_property_add_str(klass, "unix-socket",
715
+ vu_get_unix_socket,
716
+ vu_set_unix_socket);
717
+
718
+ object_class_property_add(klass, "logical-block-size", "uint32",
719
+ vu_get_blk_size, vu_set_blk_size,
720
+ NULL, NULL);
721
+}
722
+
723
+static const TypeInfo vhost_user_blk_server_info = {
724
+ .name = TYPE_VHOST_USER_BLK_SERVER,
725
+ .parent = TYPE_OBJECT,
726
+ .instance_size = sizeof(VuBlockDev),
727
+ .instance_finalize = vhost_user_blk_server_instance_finalize,
728
+ .class_init = vhost_user_blk_server_class_init,
729
+ .interfaces = (InterfaceInfo[]) {
730
+ {TYPE_USER_CREATABLE},
731
+ {}
732
+ },
733
+};
734
+
735
+static void vhost_user_blk_server_register_types(void)
736
+{
737
+ type_register_static(&vhost_user_blk_server_info);
738
+}
739
+
740
+type_init(vhost_user_blk_server_register_types)
741
diff --git a/softmmu/vl.c b/softmmu/vl.c
742
index XXXXXXX..XXXXXXX 100644
743
--- a/softmmu/vl.c
744
+++ b/softmmu/vl.c
745
@@ -XXX,XX +XXX,XX @@ static bool object_create_initial(const char *type, QemuOpts *opts)
746
}
747
#endif
748
749
+ /* Reason: vhost-user-blk-server property "node-name" */
750
+ if (g_str_equal(type, "vhost-user-blk-server")) {
751
+ return false;
752
+ }
753
/*
754
* Reason: filter-* property "netdev" etc.
755
*/
756
diff --git a/block/meson.build b/block/meson.build
757
index XXXXXXX..XXXXXXX 100644
758
--- a/block/meson.build
759
+++ b/block/meson.build
760
@@ -XXX,XX +XXX,XX @@ block_ss.add(when: 'CONFIG_WIN32', if_true: files('file-win32.c', 'win32-aio.c')
761
block_ss.add(when: 'CONFIG_POSIX', if_true: [files('file-posix.c'), coref, iokit])
762
block_ss.add(when: 'CONFIG_LIBISCSI', if_true: files('iscsi-opts.c'))
763
block_ss.add(when: 'CONFIG_LINUX', if_true: files('nvme.c'))
764
+block_ss.add(when: 'CONFIG_LINUX', if_true: files('export/vhost-user-blk-server.c', '../contrib/libvhost-user/libvhost-user.c'))
765
block_ss.add(when: 'CONFIG_REPLICATION', if_true: files('replication.c'))
766
block_ss.add(when: 'CONFIG_SHEEPDOG', if_true: files('sheepdog.c'))
767
block_ss.add(when: ['CONFIG_LINUX_AIO', libaio], if_true: files('linux-aio.c'))
768
--
769
2.26.2
770
diff view generated by jsdifflib
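Note on the discard/write-zeroes path in the patch above: the descriptor carries its start and length in 512-byte sectors, and the handler turns both into bytes with a left shift by 9. The following is a minimal standalone sketch of that arithmetic only. It assumes the fields are already in host byte order (the patch converts them with le64_to_cpu()/le32_to_cpu() first), and the names discard_desc and desc_to_byte_range are hypothetical, not part of QEMU. Widening the 32-bit sector count before the shift is a choice made in this sketch, not something the patch does.

```c
#include <stdint.h>

/* Hypothetical stand-in for the virtio-blk discard/write-zeroes descriptor. */
struct discard_desc {
    uint64_t sector;       /* start, in 512-byte sectors */
    uint32_t num_sectors;  /* length, in 512-byte sectors */
    uint32_t flags;
};

/* Fill range[0] with the byte offset and range[1] with the byte length. */
static void desc_to_byte_range(const struct discard_desc *desc,
                               uint64_t range[2])
{
    range[0] = desc->sector << 9;
    /* Widen first so a large sector count cannot overflow 32 bits. */
    range[1] = (uint64_t)desc->num_sectors << 9;
}
```

With sector = 8 and num_sectors = 16 this yields a byte offset of 4096 and a byte length of 8192.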
New patch
1
From: Coiby Xu <coiby.xu@gmail.com>
1
2
3
Suggested-by: Stefano Garzarella <sgarzare@redhat.com>
4
Signed-off-by: Coiby Xu <coiby.xu@gmail.com>
5
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
6
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
7
Message-id: 20200918080912.321299-8-coiby.xu@gmail.com
8
[Removed reference to vhost-user-blk-test.c, it will be sent in a
9
separate pull request.
10
--Stefan]
11
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
12
---
13
MAINTAINERS | 7 +++++++
14
1 file changed, 7 insertions(+)
15
16
diff --git a/MAINTAINERS b/MAINTAINERS
17
index XXXXXXX..XXXXXXX 100644
18
--- a/MAINTAINERS
19
+++ b/MAINTAINERS
20
@@ -XXX,XX +XXX,XX @@ L: qemu-block@nongnu.org
21
S: Supported
22
F: tests/image-fuzzer/
23
24
+Vhost-user block device backend server
25
+M: Coiby Xu <Coiby.Xu@gmail.com>
26
+S: Maintained
27
+F: block/export/vhost-user-blk-server.c
28
+F: util/vhost-user-server.c
29
+F: tests/qtest/libqos/vhost-user-blk.c
30
+
31
Replication
32
M: Wen Congyang <wencongyang2@huawei.com>
33
M: Xie Changlong <xiechanglong.d@gmail.com>
34
--
35
2.26.2
36
diff view generated by jsdifflib
1
vpc is not really a passthrough driver, even when using the fixed
1
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2
subformat (where host and guest offsets are equal). It should handle
2
Message-id: 20200924151549.913737-3-stefanha@redhat.com
3
preallocation like all other drivers do, namely by returning
3
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
4
DATA | RECURSE instead of RAW.
5
6
There is no tangible difference but the fact that bdrv_is_allocated() no
7
longer falls through to the protocol layer.
8
9
Signed-off-by: Max Reitz <mreitz@redhat.com>
10
Message-id: 20190725155512.9827-4-mreitz@redhat.com
11
Reviewed-by: John Snow <jsnow@redhat.com>
12
Signed-off-by: Max Reitz <mreitz@redhat.com>
13
---
4
---
14
block/vpc.c | 2 +-
5
util/vhost-user-server.c | 2 +-
15
1 file changed, 1 insertion(+), 1 deletion(-)
6
1 file changed, 1 insertion(+), 1 deletion(-)
16
7
17
diff --git a/block/vpc.c b/block/vpc.c
8
diff --git a/util/vhost-user-server.c b/util/vhost-user-server.c
18
index XXXXXXX..XXXXXXX 100644
9
index XXXXXXX..XXXXXXX 100644
19
--- a/block/vpc.c
10
--- a/util/vhost-user-server.c
20
+++ b/block/vpc.c
11
+++ b/util/vhost-user-server.c
21
@@ -XXX,XX +XXX,XX @@ static int coroutine_fn vpc_co_block_status(BlockDriverState *bs,
12
@@ -XXX,XX +XXX,XX @@ bool vhost_user_server_start(VuServer *server,
22
*pnum = bytes;
13
return false;
23
*map = offset;
24
*file = bs->file->bs;
25
- return BDRV_BLOCK_RAW | BDRV_BLOCK_OFFSET_VALID;
26
+ return BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID | BDRV_BLOCK_RECURSE;
27
}
14
}
28
15
29
qemu_co_mutex_lock(&s->lock);
16
- /* zero out unspecified fileds */
17
+ /* zero out unspecified fields */
18
*server = (VuServer) {
19
.listener = listener,
20
.vu_iface = vu_iface,
30
--
21
--
31
2.21.0
22
2.26.2
32
23
33
diff view generated by jsdifflib
New patch
1
We already have access to the value with the correct type (ioc and sioc
2
are the same QIOChannel).
1
3
4
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
5
Message-id: 20200924151549.913737-4-stefanha@redhat.com
6
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
7
---
8
util/vhost-user-server.c | 2 +-
9
1 file changed, 1 insertion(+), 1 deletion(-)
10
11
diff --git a/util/vhost-user-server.c b/util/vhost-user-server.c
12
index XXXXXXX..XXXXXXX 100644
13
--- a/util/vhost-user-server.c
14
+++ b/util/vhost-user-server.c
15
@@ -XXX,XX +XXX,XX @@ static void vu_accept(QIONetListener *listener, QIOChannelSocket *sioc,
16
server->ioc = QIO_CHANNEL(sioc);
17
object_ref(OBJECT(server->ioc));
18
qio_channel_attach_aio_context(server->ioc, server->ctx);
19
- qio_channel_set_blocking(QIO_CHANNEL(server->sioc), false, NULL);
20
+ qio_channel_set_blocking(server->ioc, false, NULL);
21
vu_client_start(server);
22
}
23
24
--
25
2.26.2
26
diff view generated by jsdifflib
New patch
1
Explicitly deleting watches is not necessary since libvhost-user calls
2
remove_watch() during vu_deinit(). Add an assertion to check this
3
though.
1
4
5
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
6
Message-id: 20200924151549.913737-5-stefanha@redhat.com
7
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
8
---
9
util/vhost-user-server.c | 19 ++++---------------
10
1 file changed, 4 insertions(+), 15 deletions(-)
11
12
diff --git a/util/vhost-user-server.c b/util/vhost-user-server.c
13
index XXXXXXX..XXXXXXX 100644
14
--- a/util/vhost-user-server.c
15
+++ b/util/vhost-user-server.c
16
@@ -XXX,XX +XXX,XX @@ static void close_client(VuServer *server)
17
/* When this is set vu_client_trip will stop new processing vhost-user message */
18
server->sioc = NULL;
19
20
- VuFdWatch *vu_fd_watch, *next;
21
- QTAILQ_FOREACH_SAFE(vu_fd_watch, &server->vu_fd_watches, next, next) {
22
- aio_set_fd_handler(server->ioc->ctx, vu_fd_watch->fd, true, NULL,
23
- NULL, NULL, NULL);
24
- }
25
-
26
- while (!QTAILQ_EMPTY(&server->vu_fd_watches)) {
27
- QTAILQ_FOREACH_SAFE(vu_fd_watch, &server->vu_fd_watches, next, next) {
28
- if (!vu_fd_watch->processing) {
29
- QTAILQ_REMOVE(&server->vu_fd_watches, vu_fd_watch, next);
30
- g_free(vu_fd_watch);
31
- }
32
- }
33
- }
34
-
35
while (server->processing_msg) {
36
if (server->ioc->read_coroutine) {
37
server->ioc->read_coroutine = NULL;
38
@@ -XXX,XX +XXX,XX @@ static void close_client(VuServer *server)
39
}
40
41
vu_deinit(&server->vu_dev);
42
+
43
+ /* vu_deinit() should have called remove_watch() */
44
+ assert(QTAILQ_EMPTY(&server->vu_fd_watches));
45
+
46
object_unref(OBJECT(sioc));
47
object_unref(OBJECT(server->ioc));
48
}
49
--
50
2.26.2
51
diff view generated by jsdifflib
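Note on the patch above: it drops the manual watch cleanup loop and instead relies on the library's teardown to remove every watch, with an assertion to catch the case where that does not hold. A rough standalone sketch of that pattern follows. Watch, Server, add_watch, remove_watch and lib_deinit are hypothetical stand-ins for VuFdWatch, VuServer, the set_watch/remove_watch callbacks and vu_deinit(); the real code uses a QTAILQ rather than a hand-rolled list.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for VuFdWatch and the server's watch list. */
typedef struct Watch {
    int fd;
    struct Watch *next;
} Watch;

typedef struct Server {
    Watch *watches;
} Server;

static void add_watch(Server *s, int fd)
{
    Watch *w = malloc(sizeof(*w));
    assert(w);
    w->fd = fd;
    w->next = s->watches;
    s->watches = w;
}

/* Unlink and free the watch for fd, if present. */
static void remove_watch(Server *s, int fd)
{
    for (Watch **p = &s->watches; *p; p = &(*p)->next) {
        if ((*p)->fd == fd) {
            Watch *w = *p;
            *p = w->next;
            free(w);
            return;
        }
    }
}

/* Stand-in for vu_deinit(): the library removes every watch it registered. */
static void lib_deinit(Server *s)
{
    while (s->watches) {
        remove_watch(s, s->watches->fd);
    }
}

static void close_client(Server *s)
{
    lib_deinit(s);
    /* No manual cleanup loop here; only check the invariant. */
    assert(s->watches == NULL);
}
```

The assertion documents who owns the cleanup: if a future library change stopped removing watches during teardown, the failure would show up immediately instead of as a silent leak.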
1
Static VDI images cannot guarantee to be zero-initialized. If the image
1
Only one struct is needed per request. Drop req_data and the separate
2
has been statically allocated, forward the call to the underlying
2
VuBlockReq instance. Instead let vu_queue_pop() allocate everything at
3
storage node.
3
once.
4
4
5
Reported-by: Stefano Garzarella <sgarzare@redhat.com>
5
This fixes the req_data memory leak in vu_block_virtio_process_req().
6
Signed-off-by: Max Reitz <mreitz@redhat.com>
6
7
Reviewed-by: Stefan Weil <sw@weilnetz.de>
7
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
8
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
8
Message-id: 20200924151549.913737-6-stefanha@redhat.com
9
Tested-by: Stefano Garzarella <sgarzare@redhat.com>
9
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10
Message-id: 20190724171239.8764-8-mreitz@redhat.com
11
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
12
Signed-off-by: Max Reitz <mreitz@redhat.com>
13
---
10
---
14
block/vdi.c | 13 ++++++++++++-
11
block/export/vhost-user-blk-server.c | 68 +++++++++-------------------
15
1 file changed, 12 insertions(+), 1 deletion(-)
12
1 file changed, 21 insertions(+), 47 deletions(-)
16
13
17
diff --git a/block/vdi.c b/block/vdi.c
14
diff --git a/block/export/vhost-user-blk-server.c b/block/export/vhost-user-blk-server.c
18
index XXXXXXX..XXXXXXX 100644
15
index XXXXXXX..XXXXXXX 100644
19
--- a/block/vdi.c
16
--- a/block/export/vhost-user-blk-server.c
20
+++ b/block/vdi.c
17
+++ b/block/export/vhost-user-blk-server.c
21
@@ -XXX,XX +XXX,XX @@ static void vdi_close(BlockDriverState *bs)
18
@@ -XXX,XX +XXX,XX @@ struct virtio_blk_inhdr {
22
error_free(s->migration_blocker);
19
};
20
21
typedef struct VuBlockReq {
22
- VuVirtqElement *elem;
23
+ VuVirtqElement elem;
24
int64_t sector_num;
25
size_t size;
26
struct virtio_blk_inhdr *in;
27
@@ -XXX,XX +XXX,XX @@ static void vu_block_req_complete(VuBlockReq *req)
28
VuDev *vu_dev = &req->server->vu_dev;
29
30
/* IO size with 1 extra status byte */
31
- vu_queue_push(vu_dev, req->vq, req->elem, req->size + 1);
32
+ vu_queue_push(vu_dev, req->vq, &req->elem, req->size + 1);
33
vu_queue_notify(vu_dev, req->vq);
34
35
- if (req->elem) {
36
- free(req->elem);
37
- }
38
-
39
- g_free(req);
40
+ free(req);
23
}
41
}
24
42
25
+static int vdi_has_zero_init(BlockDriverState *bs)
43
static VuBlockDev *get_vu_block_device_by_server(VuServer *server)
26
+{
44
@@ -XXX,XX +XXX,XX @@ static void coroutine_fn vu_block_flush(VuBlockReq *req)
27
+ BDRVVdiState *s = bs->opaque;
45
blk_co_flush(backend);
46
}
47
48
-struct req_data {
49
- VuServer *server;
50
- VuVirtq *vq;
51
- VuVirtqElement *elem;
52
-};
53
-
54
static void coroutine_fn vu_block_virtio_process_req(void *opaque)
55
{
56
- struct req_data *data = opaque;
57
- VuServer *server = data->server;
58
- VuVirtq *vq = data->vq;
59
- VuVirtqElement *elem = data->elem;
60
+ VuBlockReq *req = opaque;
61
+ VuServer *server = req->server;
62
+ VuVirtqElement *elem = &req->elem;
63
uint32_t type;
64
- VuBlockReq *req;
65
66
VuBlockDev *vdev_blk = get_vu_block_device_by_server(server);
67
BlockBackend *backend = vdev_blk->backend;
68
@@ -XXX,XX +XXX,XX @@ static void coroutine_fn vu_block_virtio_process_req(void *opaque)
69
struct iovec *out_iov = elem->out_sg;
70
unsigned in_num = elem->in_num;
71
unsigned out_num = elem->out_num;
28
+
72
+
29
+ if (s->header.image_type == VDI_TYPE_STATIC) {
73
/* refer to hw/block/virtio_blk.c */
30
+ return bdrv_has_zero_init(bs->file->bs);
74
if (elem->out_num < 1 || elem->in_num < 1) {
31
+ } else {
75
error_report("virtio-blk request missing headers");
32
+ return 1;
76
- free(elem);
33
+ }
77
- return;
34
+}
78
+ goto err;
79
}
80
81
- req = g_new0(VuBlockReq, 1);
82
- req->server = server;
83
- req->vq = vq;
84
- req->elem = elem;
85
-
86
if (unlikely(iov_to_buf(out_iov, out_num, 0, &req->out,
87
sizeof(req->out)) != sizeof(req->out))) {
88
error_report("virtio-blk request outhdr too short");
89
@@ -XXX,XX +XXX,XX @@ static void coroutine_fn vu_block_virtio_process_req(void *opaque)
90
91
err:
92
free(elem);
93
- g_free(req);
94
- return;
95
}
96
97
static void vu_block_process_vq(VuDev *vu_dev, int idx)
98
{
99
- VuServer *server;
100
- VuVirtq *vq;
101
- struct req_data *req_data;
102
+ VuServer *server = container_of(vu_dev, VuServer, vu_dev);
103
+ VuVirtq *vq = vu_get_queue(vu_dev, idx);
104
105
- server = container_of(vu_dev, VuServer, vu_dev);
106
- assert(server);
107
-
108
- vq = vu_get_queue(vu_dev, idx);
109
- assert(vq);
110
- VuVirtqElement *elem;
111
while (1) {
112
- elem = vu_queue_pop(vu_dev, vq, sizeof(VuVirtqElement) +
113
- sizeof(VuBlockReq));
114
- if (elem) {
115
- req_data = g_new0(struct req_data, 1);
116
- req_data->server = server;
117
- req_data->vq = vq;
118
- req_data->elem = elem;
119
- Coroutine *co = qemu_coroutine_create(vu_block_virtio_process_req,
120
- req_data);
121
- aio_co_enter(server->ioc->ctx, co);
122
- } else {
123
+ VuBlockReq *req;
35
+
124
+
36
static QemuOptsList vdi_create_opts = {
125
+ req = vu_queue_pop(vu_dev, vq, sizeof(VuBlockReq));
37
.name = "vdi-create-opts",
126
+ if (!req) {
38
.head = QTAILQ_HEAD_INITIALIZER(vdi_create_opts.head),
127
break;
39
@@ -XXX,XX +XXX,XX @@ static BlockDriver bdrv_vdi = {
128
}
40
.bdrv_child_perm = bdrv_format_default_perms,
129
+
41
.bdrv_co_create = vdi_co_create,
130
+ req->server = server;
42
.bdrv_co_create_opts = vdi_co_create_opts,
131
+ req->vq = vq;
43
- .bdrv_has_zero_init = bdrv_has_zero_init_1,
132
+
44
+ .bdrv_has_zero_init = vdi_has_zero_init,
133
+ Coroutine *co =
45
.bdrv_co_block_status = vdi_co_block_status,
134
+ qemu_coroutine_create(vu_block_virtio_process_req, req);
46
.bdrv_make_empty = vdi_make_empty,
135
+ qemu_coroutine_enter(co);
136
}
137
}
47
138
48
--
139
--
49
2.21.0
140
2.26.2
50
141
51
diff view generated by jsdifflib
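Note on the "one struct per request" patch above: it works because the queue element is embedded as the first member of the request, and the pop function is asked to allocate the size of the whole request. A minimal standalone sketch of that layout trick follows. Elem, Req, queue_pop and req_new are hypothetical stand-ins for VuVirtqElement, VuBlockReq, vu_queue_pop() and the loop body in vu_block_process_vq(); zeroing the allocation is an assumption of this sketch.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for VuVirtqElement. */
typedef struct Elem {
    unsigned int index;
} Elem;

/*
 * The element is the first member, so a pointer to the allocation is
 * also a valid pointer to the enclosing request.
 */
typedef struct Req {
    Elem elem;
    void *server;   /* filled in by the caller after the pop */
    int status;
} Req;

/*
 * Hypothetical pop: allocates sz zeroed bytes (sz >= sizeof(Elem)) and
 * initialises only the leading Elem. Returns NULL on failure.
 */
static void *queue_pop(unsigned int index, size_t sz)
{
    if (sz < sizeof(Elem)) {
        return NULL;
    }
    Elem *elem = calloc(1, sz);
    if (elem) {
        elem->index = index;
    }
    return elem;
}

/* One allocation covers the element and the per-request state. */
static Req *req_new(unsigned int index, void *server)
{
    Req *req = queue_pop(index, sizeof(Req));
    if (req) {
        req->server = server;
    }
    return req;
}
```

Because there is a single allocation, there is a single release: the sketch's caller frees the request with free(), which mirrors the switch from g_free(req) plus free(req->elem) to a plain free(req) in the patch.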
New patch
1
The device panic notifier callback is not used. Drop it.
1
2
3
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
4
Message-id: 20200924151549.913737-7-stefanha@redhat.com
5
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
6
---
7
util/vhost-user-server.h | 3 ---
8
block/export/vhost-user-blk-server.c | 3 +--
9
util/vhost-user-server.c | 6 ------
10
3 files changed, 1 insertion(+), 11 deletions(-)
11
12
diff --git a/util/vhost-user-server.h b/util/vhost-user-server.h
13
index XXXXXXX..XXXXXXX 100644
14
--- a/util/vhost-user-server.h
15
+++ b/util/vhost-user-server.h
16
@@ -XXX,XX +XXX,XX @@ typedef struct VuFdWatch {
17
} VuFdWatch;
18
19
typedef struct VuServer VuServer;
20
-typedef void DevicePanicNotifierFn(VuServer *server);
21
22
struct VuServer {
23
QIONetListener *listener;
24
AioContext *ctx;
25
- DevicePanicNotifierFn *device_panic_notifier;
26
int max_queues;
27
const VuDevIface *vu_iface;
28
VuDev vu_dev;
29
@@ -XXX,XX +XXX,XX @@ bool vhost_user_server_start(VuServer *server,
30
SocketAddress *unix_socket,
31
AioContext *ctx,
32
uint16_t max_queues,
33
- DevicePanicNotifierFn *device_panic_notifier,
34
const VuDevIface *vu_iface,
35
Error **errp);
36
37
diff --git a/block/export/vhost-user-blk-server.c b/block/export/vhost-user-blk-server.c
38
index XXXXXXX..XXXXXXX 100644
39
--- a/block/export/vhost-user-blk-server.c
40
+++ b/block/export/vhost-user-blk-server.c
41
@@ -XXX,XX +XXX,XX @@ static void vhost_user_blk_server_start(VuBlockDev *vu_block_device,
42
ctx = bdrv_get_aio_context(blk_bs(vu_block_device->backend));
43
44
if (!vhost_user_server_start(&vu_block_device->vu_server, addr, ctx,
45
- VHOST_USER_BLK_MAX_QUEUES,
46
- NULL, &vu_block_iface,
47
+ VHOST_USER_BLK_MAX_QUEUES, &vu_block_iface,
48
errp)) {
49
goto error;
50
}
51
diff --git a/util/vhost-user-server.c b/util/vhost-user-server.c
52
index XXXXXXX..XXXXXXX 100644
53
--- a/util/vhost-user-server.c
54
+++ b/util/vhost-user-server.c
55
@@ -XXX,XX +XXX,XX @@ static void panic_cb(VuDev *vu_dev, const char *buf)
56
close_client(server);
57
}
58
59
- if (server->device_panic_notifier) {
60
- server->device_panic_notifier(server);
61
- }
62
-
63
/*
64
* Set the callback function for network listener so another
65
* vhost-user client can connect to this server
66
@@ -XXX,XX +XXX,XX @@ bool vhost_user_server_start(VuServer *server,
67
SocketAddress *socket_addr,
68
AioContext *ctx,
69
uint16_t max_queues,
70
- DevicePanicNotifierFn *device_panic_notifier,
71
const VuDevIface *vu_iface,
72
Error **errp)
73
{
74
@@ -XXX,XX +XXX,XX @@ bool vhost_user_server_start(VuServer *server,
75
.vu_iface = vu_iface,
76
.max_queues = max_queues,
77
.ctx = ctx,
78
- .device_panic_notifier = device_panic_notifier,
79
};
80
81
qio_net_listener_set_name(server->listener, "vhost-user-backend-listener");
82
--
83
2.26.2
84
diff view generated by jsdifflib
New patch
1
fds[] is leaked when qio_channel_readv_full() fails.
1
2
3
Use vmsg->fds[] instead of keeping a local fds[] array. Then we can
4
reuse goto fail to clean up fds. vmsg->fd_num must be zeroed before the
5
loop to make this safe.
6
7
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
8
Message-id: 20200924151549.913737-8-stefanha@redhat.com
9
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10
---
11
util/vhost-user-server.c | 50 ++++++++++++++++++----------------------
12
1 file changed, 23 insertions(+), 27 deletions(-)
13
14
diff --git a/util/vhost-user-server.c b/util/vhost-user-server.c
15
index XXXXXXX..XXXXXXX 100644
16
--- a/util/vhost-user-server.c
17
+++ b/util/vhost-user-server.c
18
@@ -XXX,XX +XXX,XX @@ vu_message_read(VuDev *vu_dev, int conn_fd, VhostUserMsg *vmsg)
19
};
20
int rc, read_bytes = 0;
21
Error *local_err = NULL;
22
- /*
23
- * Store fds/nfds returned from qio_channel_readv_full into
24
- * temporary variables.
25
- *
26
- * VhostUserMsg is a packed structure, gcc will complain about passing
27
- * pointer to a packed structure member if we pass &VhostUserMsg.fd_num
28
- * and &VhostUserMsg.fds directly when calling qio_channel_readv_full,
29
- * thus two temporary variables nfds and fds are used here.
30
- */
31
- size_t nfds = 0, nfds_t = 0;
32
const size_t max_fds = G_N_ELEMENTS(vmsg->fds);
33
- int *fds_t = NULL;
34
VuServer *server = container_of(vu_dev, VuServer, vu_dev);
35
QIOChannel *ioc = server->ioc;
36
37
+ vmsg->fd_num = 0;
38
if (!ioc) {
39
error_report_err(local_err);
40
goto fail;
41
@@ -XXX,XX +XXX,XX @@ vu_message_read(VuDev *vu_dev, int conn_fd, VhostUserMsg *vmsg)
42
43
assert(qemu_in_coroutine());
44
do {
45
+ size_t nfds = 0;
46
+ int *fds = NULL;
47
+
48
/*
49
* qio_channel_readv_full may have short reads, keeping calling it
50
* until getting VHOST_USER_HDR_SIZE or 0 bytes in total
51
*/
52
- rc = qio_channel_readv_full(ioc, &iov, 1, &fds_t, &nfds_t, &local_err);
53
+ rc = qio_channel_readv_full(ioc, &iov, 1, &fds, &nfds, &local_err);
54
if (rc < 0) {
55
if (rc == QIO_CHANNEL_ERR_BLOCK) {
56
+ assert(local_err == NULL);
57
qio_channel_yield(ioc, G_IO_IN);
58
continue;
59
} else {
60
error_report_err(local_err);
61
- return false;
62
+ goto fail;
63
}
64
}
65
- read_bytes += rc;
66
- if (nfds_t > 0) {
67
- if (nfds + nfds_t > max_fds) {
68
+
69
+ if (nfds > 0) {
70
+ if (vmsg->fd_num + nfds > max_fds) {
71
error_report("A maximum of %zu fds are allowed, "
72
"however got %zu fds now",
73
- max_fds, nfds + nfds_t);
74
+ max_fds, vmsg->fd_num + nfds);
75
+ g_free(fds);
76
goto fail;
77
}
78
- memcpy(vmsg->fds + nfds, fds_t,
79
- nfds_t *sizeof(vmsg->fds[0]));
80
- nfds += nfds_t;
81
- g_free(fds_t);
82
+ memcpy(vmsg->fds + vmsg->fd_num, fds, nfds * sizeof(vmsg->fds[0]));
83
+ vmsg->fd_num += nfds;
84
+ g_free(fds);
85
}
86
- if (read_bytes == VHOST_USER_HDR_SIZE || rc == 0) {
87
- break;
88
+
89
+ if (rc == 0) { /* socket closed */
90
+ goto fail;
91
}
92
- iov.iov_base = (char *)vmsg + read_bytes;
93
- iov.iov_len = VHOST_USER_HDR_SIZE - read_bytes;
94
- } while (true);
95
96
- vmsg->fd_num = nfds;
97
+ iov.iov_base += rc;
98
+ iov.iov_len -= rc;
99
+ read_bytes += rc;
100
+ } while (read_bytes != VHOST_USER_HDR_SIZE);
101
+
102
/* qio_channel_readv_full will make socket fds blocking, unblock them */
103
vmsg_unblock_fds(vmsg);
104
if (vmsg->size > sizeof(vmsg->payload)) {
105
--
106
2.26.2
107
New patch
1
Unexpected EOF is an error that must be reported.
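
As a rough illustration of why checking only for -1 is not enough, the sketch below uses a hypothetical helper, toy_read_all_eof(), with a three-way return convention (1 = all bytes read, 0 = EOF before any byte, -1 = error). That convention is an assumption chosen to resemble the check in this patch; the helper is built on plain POSIX read() and is not the QIOChannel implementation.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: read exactly len bytes from fd.
 * Returns 1 when all bytes were read, 0 on EOF before the first byte,
 * and -1 on a read error or on EOF in the middle of the buffer.
 */
static int toy_read_all_eof(int fd, void *buf, size_t len)
{
    char *p = buf;
    size_t done = 0;

    while (done < len) {
        ssize_t rc = read(fd, p + done, len - done);
        if (rc < 0) {
            if (errno == EINTR) {
                continue; /* interrupted, retry */
            }
            return -1;
        }
        if (rc == 0) {
            return done == 0 ? 0 : -1; /* clean EOF vs. truncated data */
        }
        done += (size_t)rc;
    }
    return 1;
}

/* A payload is only valid if it was read completely: 0 and -1 both fail. */
static bool toy_read_payload(int fd, void *buf, size_t len)
{
    return toy_read_all_eof(fd, buf, len) == 1;
}
```

With this convention a caller that tests "rc == -1" silently accepts a closed socket, whereas "rc != 1" treats the missing payload as the failure it is.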
1
2
3
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
4
Message-id: 20200924151549.913737-9-stefanha@redhat.com
5
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
6
---
7
util/vhost-user-server.c | 6 ++++--
8
1 file changed, 4 insertions(+), 2 deletions(-)
9
10
diff --git a/util/vhost-user-server.c b/util/vhost-user-server.c
11
index XXXXXXX..XXXXXXX 100644
12
--- a/util/vhost-user-server.c
13
+++ b/util/vhost-user-server.c
14
@@ -XXX,XX +XXX,XX @@ vu_message_read(VuDev *vu_dev, int conn_fd, VhostUserMsg *vmsg)
15
};
16
if (vmsg->size) {
17
rc = qio_channel_readv_all_eof(ioc, &iov_payload, 1, &local_err);
18
- if (rc == -1) {
19
- error_report_err(local_err);
20
+ if (rc != 1) {
21
+ if (local_err) {
22
+ error_report_err(local_err);
23
+ }
24
goto fail;
25
}
26
}
27
--
28
2.26.2
29
1
Fixed VHDX images cannot guarantee to be zero-initialized. If the image
1
The vu_client_trip() coroutine is leaked during AioContext switching. It
2
has the "fixed" subformat, forward the call to the underlying storage
2
is also unsafe to destroy the vu_dev in panic_cb() since its callers
3
node.
3
still access it in some cases.
4
4
5
Reported-by: Stefano Garzarella <sgarzare@redhat.com>
5
Rework the lifecycle to solve these safety issues.
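
The reworked lifecycle can be pictured with a small self-contained model. Everything here is hypothetical (struct toy_dev and the toy_* functions are illustrative names, not libvhost-user or VuServer code): the panic callback only records the error, and the dispatch loop is the single place that notices the broken flag or a closed connection and then performs cleanup.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the device state; not the real VuDev. */
struct toy_dev {
    bool broken;     /* set by the panic callback */
    int pending;     /* messages left before the peer disconnects */
    int bad_at;      /* index of a message that triggers a panic, or -1 */
    int handled;     /* messages processed successfully */
    bool cleaned_up; /* set once, by the dispatch loop owner */
};

/* The panic callback only flags the error; it destroys nothing. */
static void toy_panic(struct toy_dev *dev)
{
    dev->broken = true;
}

/* Process one message; returns false once the connection is closed. */
static bool toy_dispatch(struct toy_dev *dev)
{
    if (dev->pending == 0) {
        return false; /* peer closed the connection */
    }
    dev->pending--;
    if (dev->handled == dev->bad_at) {
        toy_panic(dev); /* error is recorded, the caller decides what to do */
        return true;
    }
    dev->handled++;
    return true;
}

/* The loop owner stops on error or disconnect and cleans up exactly once. */
static void toy_client_trip(struct toy_dev *dev)
{
    while (!dev->broken && toy_dispatch(dev)) {
        /* keep running */
    }
    dev->cleaned_up = true;
}
```

Because teardown happens only after the loop exits, code that is still on the stack when the panic fires never touches freed state.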
6
Signed-off-by: Max Reitz <mreitz@redhat.com>
6
7
Message-id: 20190724171239.8764-9-mreitz@redhat.com
7
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
8
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
8
Message-id: 20200924151549.913737-10-stefanha@redhat.com
9
Signed-off-by: Max Reitz <mreitz@redhat.com>
9
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10
---
10
---
11
block/vhdx.c | 26 +++++++++++++++++++++++++-
11
util/vhost-user-server.h | 29 ++--
12
1 file changed, 25 insertions(+), 1 deletion(-)
12
block/export/vhost-user-blk-server.c | 9 +-
13
util/vhost-user-server.c | 245 +++++++++++++++------------
14
3 files changed, 155 insertions(+), 128 deletions(-)
13
15
14
diff --git a/block/vhdx.c b/block/vhdx.c
16
diff --git a/util/vhost-user-server.h b/util/vhost-user-server.h
15
index XXXXXXX..XXXXXXX 100644
17
index XXXXXXX..XXXXXXX 100644
16
--- a/block/vhdx.c
18
--- a/util/vhost-user-server.h
17
+++ b/block/vhdx.c
19
+++ b/util/vhost-user-server.h
18
@@ -XXX,XX +XXX,XX @@ static int coroutine_fn vhdx_co_check(BlockDriverState *bs,
20
@@ -XXX,XX +XXX,XX @@
19
return 0;
21
#include "qapi/error.h"
20
}
22
#include "standard-headers/linux/virtio_blk.h"
21
23
22
+static int vhdx_has_zero_init(BlockDriverState *bs)
24
+/* A kick fd that we monitor on behalf of libvhost-user */
25
typedef struct VuFdWatch {
26
VuDev *vu_dev;
27
int fd; /*kick fd*/
28
void *pvt;
29
vu_watch_cb cb;
30
- bool processing;
31
QTAILQ_ENTRY(VuFdWatch) next;
32
} VuFdWatch;
33
34
-typedef struct VuServer VuServer;
35
-
36
-struct VuServer {
37
+/**
38
+ * VuServer:
39
+ * A vhost-user server instance with user-defined VuDevIface callbacks.
40
+ * Vhost-user device backends can be implemented using VuServer. VuDevIface
41
+ * callbacks and virtqueue kicks run in the given AioContext.
42
+ */
43
+typedef struct {
44
QIONetListener *listener;
45
+ QEMUBH *restart_listener_bh;
46
AioContext *ctx;
47
int max_queues;
48
const VuDevIface *vu_iface;
49
+
50
+ /* Protected by ctx lock */
51
VuDev vu_dev;
52
QIOChannel *ioc; /* The I/O channel with the client */
53
QIOChannelSocket *sioc; /* The underlying data channel with the client */
54
- /* IOChannel for fd provided via VHOST_USER_SET_SLAVE_REQ_FD */
55
- QIOChannel *ioc_slave;
56
- QIOChannelSocket *sioc_slave;
57
- Coroutine *co_trip; /* coroutine for processing VhostUserMsg */
58
QTAILQ_HEAD(, VuFdWatch) vu_fd_watches;
59
- /* restart coroutine co_trip if AIOContext is changed */
60
- bool aio_context_changed;
61
- bool processing_msg;
62
-};
63
+
64
+ Coroutine *co_trip; /* coroutine for processing VhostUserMsg */
65
+} VuServer;
66
67
bool vhost_user_server_start(VuServer *server,
68
SocketAddress *unix_socket,
69
@@ -XXX,XX +XXX,XX @@ bool vhost_user_server_start(VuServer *server,
70
71
void vhost_user_server_stop(VuServer *server);
72
73
-void vhost_user_server_set_aio_context(VuServer *server, AioContext *ctx);
74
+void vhost_user_server_attach_aio_context(VuServer *server, AioContext *ctx);
75
+void vhost_user_server_detach_aio_context(VuServer *server);
76
77
#endif /* VHOST_USER_SERVER_H */
78
diff --git a/block/export/vhost-user-blk-server.c b/block/export/vhost-user-blk-server.c
79
index XXXXXXX..XXXXXXX 100644
80
--- a/block/export/vhost-user-blk-server.c
81
+++ b/block/export/vhost-user-blk-server.c
82
@@ -XXX,XX +XXX,XX @@ static const VuDevIface vu_block_iface = {
83
static void blk_aio_attached(AioContext *ctx, void *opaque)
84
{
85
VuBlockDev *vub_dev = opaque;
86
- aio_context_acquire(ctx);
87
- vhost_user_server_set_aio_context(&vub_dev->vu_server, ctx);
88
- aio_context_release(ctx);
89
+ vhost_user_server_attach_aio_context(&vub_dev->vu_server, ctx);
90
}
91
92
static void blk_aio_detach(void *opaque)
93
{
94
VuBlockDev *vub_dev = opaque;
95
- AioContext *ctx = vub_dev->vu_server.ctx;
96
- aio_context_acquire(ctx);
97
- vhost_user_server_set_aio_context(&vub_dev->vu_server, NULL);
98
- aio_context_release(ctx);
99
+ vhost_user_server_detach_aio_context(&vub_dev->vu_server);
100
}
101
102
static void
103
diff --git a/util/vhost-user-server.c b/util/vhost-user-server.c
104
index XXXXXXX..XXXXXXX 100644
105
--- a/util/vhost-user-server.c
106
+++ b/util/vhost-user-server.c
107
@@ -XXX,XX +XXX,XX @@
108
*/
109
#include "qemu/osdep.h"
110
#include "qemu/main-loop.h"
111
+#include "block/aio-wait.h"
112
#include "vhost-user-server.h"
113
114
+/*
115
+ * Theory of operation:
116
+ *
117
+ * VuServer is started and stopped by vhost_user_server_start() and
118
+ * vhost_user_server_stop() from the main loop thread. Starting the server
119
+ * opens a vhost-user UNIX domain socket and listens for incoming connections.
120
+ * Only one connection is allowed at a time.
121
+ *
122
+ * The connection is handled by the vu_client_trip() coroutine in the
123
+ * VuServer->ctx AioContext. The coroutine consists of a vu_dispatch() loop
124
+ * where libvhost-user calls vu_message_read() to receive the next vhost-user
125
+ * protocol messages over the UNIX domain socket.
126
+ *
127
+ * When virtqueues are set up libvhost-user calls set_watch() to monitor kick
128
+ * fds. These fds are also handled in the VuServer->ctx AioContext.
129
+ *
130
+ * Both vu_client_trip() and kick fd monitoring can be stopped by shutting down
131
+ * the socket connection. Shutting down the socket connection causes
132
+ * vu_message_read() to fail since no more data can be received from the socket.
133
+ * After vu_dispatch() fails, vu_client_trip() calls vu_deinit() to stop
134
+ * libvhost-user before terminating the coroutine. vu_deinit() calls
135
+ * remove_watch() to stop monitoring kick fds and this stops virtqueue
136
+ * processing.
137
+ *
138
+ * When vu_client_trip() has finished cleaning up it schedules a BH in the main
139
+ * loop thread to accept the next client connection.
140
+ *
141
+ * When libvhost-user detects an error it calls panic_cb() and sets the
142
+ * dev->broken flag. Both vu_client_trip() and kick fd processing stop when
143
+ * the dev->broken flag is set.
144
+ *
145
+ * It is possible to switch AioContexts using
146
+ * vhost_user_server_detach_aio_context() and
147
+ * vhost_user_server_attach_aio_context(). They stop monitoring fds in the old
148
+ * AioContext and resume monitoring in the new AioContext. The vu_client_trip()
149
+ * coroutine remains in a yielded state during the switch. This is made
150
+ * possible by QIOChannel's support for spurious coroutine re-entry in
151
+ * qio_channel_yield(). The coroutine will restart I/O when re-entered from the
152
+ * new AioContext.
153
+ */
154
+
155
static void vmsg_close_fds(VhostUserMsg *vmsg)
156
{
157
int i;
158
@@ -XXX,XX +XXX,XX @@ static void vmsg_unblock_fds(VhostUserMsg *vmsg)
159
}
160
}
161
162
-static void vu_accept(QIONetListener *listener, QIOChannelSocket *sioc,
163
- gpointer opaque);
164
-
165
-static void close_client(VuServer *server)
166
-{
167
- /*
168
- * Before closing the client
169
- *
170
- * 1. Let vu_client_trip stop processing new vhost-user msg
171
- *
172
- * 2. remove kick_handler
173
- *
174
- * 3. wait for the kick handler to be finished
175
- *
176
- * 4. wait for the current vhost-user msg to be finished processing
177
- */
178
-
179
- QIOChannelSocket *sioc = server->sioc;
180
- /* When this is set vu_client_trip will stop new processing vhost-user message */
181
- server->sioc = NULL;
182
-
183
- while (server->processing_msg) {
184
- if (server->ioc->read_coroutine) {
185
- server->ioc->read_coroutine = NULL;
186
- qio_channel_set_aio_fd_handler(server->ioc, server->ioc->ctx, NULL,
187
- NULL, server->ioc);
188
- server->processing_msg = false;
189
- }
190
- }
191
-
192
- vu_deinit(&server->vu_dev);
193
-
194
- /* vu_deinit() should have called remove_watch() */
195
- assert(QTAILQ_EMPTY(&server->vu_fd_watches));
196
-
197
- object_unref(OBJECT(sioc));
198
- object_unref(OBJECT(server->ioc));
199
-}
200
-
201
static void panic_cb(VuDev *vu_dev, const char *buf)
202
{
203
- VuServer *server = container_of(vu_dev, VuServer, vu_dev);
204
-
205
- /* avoid while loop in close_client */
206
- server->processing_msg = false;
207
-
208
- if (buf) {
209
- error_report("vu_panic: %s", buf);
210
- }
211
-
212
- if (server->sioc) {
213
- close_client(server);
214
- }
215
-
216
- /*
217
- * Set the callback function for network listener so another
218
- * vhost-user client can connect to this server
219
- */
220
- qio_net_listener_set_client_func(server->listener,
221
- vu_accept,
222
- server,
223
- NULL);
224
+ error_report("vu_panic: %s", buf);
225
}
226
227
static bool coroutine_fn
228
@@ -XXX,XX +XXX,XX @@ fail:
229
return false;
230
}
231
232
-
233
-static void vu_client_start(VuServer *server);
234
static coroutine_fn void vu_client_trip(void *opaque)
235
{
236
VuServer *server = opaque;
237
+ VuDev *vu_dev = &server->vu_dev;
238
239
- while (!server->aio_context_changed && server->sioc) {
240
- server->processing_msg = true;
241
- vu_dispatch(&server->vu_dev);
242
- server->processing_msg = false;
243
+ while (!vu_dev->broken && vu_dispatch(vu_dev)) {
244
+ /* Keep running */
245
}
246
247
- if (server->aio_context_changed && server->sioc) {
248
- server->aio_context_changed = false;
249
- vu_client_start(server);
250
- }
251
-}
252
+ vu_deinit(vu_dev);
253
+
254
+ /* vu_deinit() should have called remove_watch() */
255
+ assert(QTAILQ_EMPTY(&server->vu_fd_watches));
256
+
257
+ object_unref(OBJECT(server->sioc));
258
+ server->sioc = NULL;
259
260
-static void vu_client_start(VuServer *server)
261
-{
262
- server->co_trip = qemu_coroutine_create(vu_client_trip, server);
263
- aio_co_enter(server->ctx, server->co_trip);
264
+ object_unref(OBJECT(server->ioc));
265
+ server->ioc = NULL;
266
+
267
+ server->co_trip = NULL;
268
+ if (server->restart_listener_bh) {
269
+ qemu_bh_schedule(server->restart_listener_bh);
270
+ }
271
+ aio_wait_kick();
272
}
273
274
/*
275
@@ -XXX,XX +XXX,XX @@ static void vu_client_start(VuServer *server)
276
static void kick_handler(void *opaque)
277
{
278
VuFdWatch *vu_fd_watch = opaque;
279
- vu_fd_watch->processing = true;
280
- vu_fd_watch->cb(vu_fd_watch->vu_dev, 0, vu_fd_watch->pvt);
281
- vu_fd_watch->processing = false;
282
+ VuDev *vu_dev = vu_fd_watch->vu_dev;
283
+
284
+ vu_fd_watch->cb(vu_dev, 0, vu_fd_watch->pvt);
285
+
286
+ /* Stop vu_client_trip() if an error occurred in vu_fd_watch->cb() */
287
+ if (vu_dev->broken) {
288
+ VuServer *server = container_of(vu_dev, VuServer, vu_dev);
289
+
290
+ qio_channel_shutdown(server->ioc, QIO_CHANNEL_SHUTDOWN_BOTH, NULL);
291
+ }
292
}
293
294
-
295
static VuFdWatch *find_vu_fd_watch(VuServer *server, int fd)
296
{
297
298
@@ -XXX,XX +XXX,XX @@ static void vu_accept(QIONetListener *listener, QIOChannelSocket *sioc,
299
qio_channel_set_name(QIO_CHANNEL(sioc), "vhost-user client");
300
server->ioc = QIO_CHANNEL(sioc);
301
object_ref(OBJECT(server->ioc));
302
- qio_channel_attach_aio_context(server->ioc, server->ctx);
303
+
304
+ /* TODO vu_message_write() spins if non-blocking! */
305
qio_channel_set_blocking(server->ioc, false, NULL);
306
- vu_client_start(server);
307
+
308
+ server->co_trip = qemu_coroutine_create(vu_client_trip, server);
309
+
310
+ aio_context_acquire(server->ctx);
311
+ vhost_user_server_attach_aio_context(server, server->ctx);
312
+ aio_context_release(server->ctx);
313
}
314
315
-
316
void vhost_user_server_stop(VuServer *server)
317
{
318
+ aio_context_acquire(server->ctx);
319
+
320
+ qemu_bh_delete(server->restart_listener_bh);
321
+ server->restart_listener_bh = NULL;
322
+
323
if (server->sioc) {
324
- close_client(server);
325
+ VuFdWatch *vu_fd_watch;
326
+
327
+ QTAILQ_FOREACH(vu_fd_watch, &server->vu_fd_watches, next) {
328
+ aio_set_fd_handler(server->ctx, vu_fd_watch->fd, true,
329
+ NULL, NULL, NULL, vu_fd_watch);
330
+ }
331
+
332
+ qio_channel_shutdown(server->ioc, QIO_CHANNEL_SHUTDOWN_BOTH, NULL);
333
+
334
+ AIO_WAIT_WHILE(server->ctx, server->co_trip);
335
}
336
337
+ aio_context_release(server->ctx);
338
+
339
if (server->listener) {
340
qio_net_listener_disconnect(server->listener);
341
object_unref(OBJECT(server->listener));
342
}
343
+}
344
+
345
+/*
346
+ * Allow the next client to connect to the server. Called from a BH in the main
347
+ * loop.
348
+ */
349
+static void restart_listener_bh(void *opaque)
23
+{
350
+{
24
+ BDRVVHDXState *s = bs->opaque;
351
+ VuServer *server = opaque;
25
+ int state;
352
26
+
353
+ qio_net_listener_set_client_func(server->listener, vu_accept, server,
27
+ /*
354
+ NULL);
28
+ * Check the subformat: Fixed images have all BAT entries present,
355
}
29
+ * dynamic images have none (right after creation). It is
356
30
+ * therefore enough to check the first BAT entry.
357
-void vhost_user_server_set_aio_context(VuServer *server, AioContext *ctx)
31
+ */
358
+/* Called with ctx acquired */
32
+ if (!s->bat_entries) {
359
+void vhost_user_server_attach_aio_context(VuServer *server, AioContext *ctx)
33
+ return 1;
360
{
361
- VuFdWatch *vu_fd_watch, *next;
362
- void *opaque = NULL;
363
- IOHandler *io_read = NULL;
364
- bool attach;
365
+ VuFdWatch *vu_fd_watch;
366
367
- server->ctx = ctx ? ctx : qemu_get_aio_context();
368
+ server->ctx = ctx;
369
370
if (!server->sioc) {
371
- /* not yet serving any client*/
372
return;
373
}
374
375
- if (ctx) {
376
- qio_channel_attach_aio_context(server->ioc, ctx);
377
- server->aio_context_changed = true;
378
- io_read = kick_handler;
379
- attach = true;
380
- } else {
381
+ qio_channel_attach_aio_context(server->ioc, ctx);
382
+
383
+ QTAILQ_FOREACH(vu_fd_watch, &server->vu_fd_watches, next) {
384
+ aio_set_fd_handler(ctx, vu_fd_watch->fd, true, kick_handler, NULL,
385
+ NULL, vu_fd_watch);
34
+ }
386
+ }
35
+
387
+
36
+ state = s->bat[0] & VHDX_BAT_STATE_BIT_MASK;
388
+ aio_co_schedule(ctx, server->co_trip);
37
+ if (state == PAYLOAD_BLOCK_FULLY_PRESENT) {
38
+ /* Fixed subformat */
39
+ return bdrv_has_zero_init(bs->file->bs);
40
+ }
41
+
42
+ /* Dynamic subformat */
43
+ return 1;
44
+}
389
+}
45
+
390
+
46
static QemuOptsList vhdx_create_opts = {
391
+/* Called with server->ctx acquired */
47
.name = "vhdx-create-opts",
392
+void vhost_user_server_detach_aio_context(VuServer *server)
48
.head = QTAILQ_HEAD_INITIALIZER(vhdx_create_opts.head),
393
+{
49
@@ -XXX,XX +XXX,XX @@ static BlockDriver bdrv_vhdx = {
394
+ if (server->sioc) {
50
.bdrv_co_create_opts = vhdx_co_create_opts,
395
+ VuFdWatch *vu_fd_watch;
51
.bdrv_get_info = vhdx_get_info,
396
+
52
.bdrv_co_check = vhdx_co_check,
397
+ QTAILQ_FOREACH(vu_fd_watch, &server->vu_fd_watches, next) {
53
- .bdrv_has_zero_init = bdrv_has_zero_init_1,
398
+ aio_set_fd_handler(server->ctx, vu_fd_watch->fd, true,
54
+ .bdrv_has_zero_init = vhdx_has_zero_init,
399
+ NULL, NULL, NULL, vu_fd_watch);
55
400
+ }
56
.create_opts = &vhdx_create_opts,
401
+
57
};
402
qio_channel_detach_aio_context(server->ioc);
403
- /* server->ioc->ctx keeps the old AioConext */
404
- ctx = server->ioc->ctx;
405
- attach = false;
406
}
407
408
- QTAILQ_FOREACH_SAFE(vu_fd_watch, &server->vu_fd_watches, next, next) {
409
- if (vu_fd_watch->cb) {
410
- opaque = attach ? vu_fd_watch : NULL;
411
- aio_set_fd_handler(ctx, vu_fd_watch->fd, true,
412
- io_read, NULL, NULL,
413
- opaque);
414
- }
415
- }
416
+ server->ctx = NULL;
417
}
418
419
-
420
bool vhost_user_server_start(VuServer *server,
421
SocketAddress *socket_addr,
422
AioContext *ctx,
423
@@ -XXX,XX +XXX,XX @@ bool vhost_user_server_start(VuServer *server,
424
const VuDevIface *vu_iface,
425
Error **errp)
426
{
427
+ QEMUBH *bh;
428
QIONetListener *listener = qio_net_listener_new();
429
if (qio_net_listener_open_sync(listener, socket_addr, 1,
430
errp) < 0) {
431
@@ -XXX,XX +XXX,XX @@ bool vhost_user_server_start(VuServer *server,
432
return false;
433
}
434
435
+ bh = qemu_bh_new(restart_listener_bh, server);
436
+
437
/* zero out unspecified fields */
438
*server = (VuServer) {
439
.listener = listener,
440
+ .restart_listener_bh = bh,
441
.vu_iface = vu_iface,
442
.max_queues = max_queues,
443
.ctx = ctx,
58
--
444
--
59
2.21.0
445
2.26.2
60
446
61
New patch
1
Propagate the flush return value since errors are possible.
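
A minimal sketch of the idea, assuming a flush callback that returns 0 on success and a negative errno on failure. The toy_* names are hypothetical; the two status values mirror VIRTIO_BLK_S_OK (0) and VIRTIO_BLK_S_IOERR (1) from the virtio-blk specification.

```c
#include <stddef.h>
#include <stdint.h>

/* Status codes as defined for virtio-blk requests. */
#define TOY_VIRTIO_BLK_S_OK    0
#define TOY_VIRTIO_BLK_S_IOERR 1

/* Hypothetical flush callback: 0 on success, negative errno on failure. */
typedef int (*toy_flush_fn)(void *opaque);

/* Translate the flush result into the status byte reported to the guest. */
static uint8_t toy_flush_status(toy_flush_fn flush, void *opaque)
{
    return flush(opaque) == 0 ? TOY_VIRTIO_BLK_S_OK : TOY_VIRTIO_BLK_S_IOERR;
}

/* Two trivial callbacks used to exercise both outcomes. */
static int toy_flush_ok(void *opaque)
{
    (void)opaque;
    return 0;
}

static int toy_flush_fail(void *opaque)
{
    (void)opaque;
    return -5; /* -EIO on Linux */
}
```

Reporting success unconditionally would tell the guest that its data is durable even when the backend failed to flush it.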
1
2
3
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
4
Message-id: 20200924151549.913737-11-stefanha@redhat.com
5
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
6
---
7
block/export/vhost-user-blk-server.c | 11 +++++++----
8
1 file changed, 7 insertions(+), 4 deletions(-)
9
10
diff --git a/block/export/vhost-user-blk-server.c b/block/export/vhost-user-blk-server.c
11
index XXXXXXX..XXXXXXX 100644
12
--- a/block/export/vhost-user-blk-server.c
13
+++ b/block/export/vhost-user-blk-server.c
14
@@ -XXX,XX +XXX,XX @@ vu_block_discard_write_zeroes(VuBlockReq *req, struct iovec *iov,
15
return -EINVAL;
16
}
17
18
-static void coroutine_fn vu_block_flush(VuBlockReq *req)
19
+static int coroutine_fn vu_block_flush(VuBlockReq *req)
20
{
21
VuBlockDev *vdev_blk = get_vu_block_device_by_server(req->server);
22
BlockBackend *backend = vdev_blk->backend;
23
- blk_co_flush(backend);
24
+ return blk_co_flush(backend);
25
}
26
27
static void coroutine_fn vu_block_virtio_process_req(void *opaque)
28
@@ -XXX,XX +XXX,XX @@ static void coroutine_fn vu_block_virtio_process_req(void *opaque)
29
break;
30
}
31
case VIRTIO_BLK_T_FLUSH:
32
- vu_block_flush(req);
33
- req->in->status = VIRTIO_BLK_S_OK;
34
+ if (vu_block_flush(req) == 0) {
35
+ req->in->status = VIRTIO_BLK_S_OK;
36
+ } else {
37
+ req->in->status = VIRTIO_BLK_S_IOERR;
38
+ }
39
break;
40
case VIRTIO_BLK_T_GET_ID: {
41
size_t size = MIN(iov_size(&elem->in_sg[0], in_num),
42
--
43
2.26.2
44
1
From: Maxim Levitsky <mlevitsk@redhat.com>
1
Use the new QAPI block exports API instead of defining our own QOM
2
2
objects.
3
preallocation=off and preallocation=metadata
3
4
both allocate the LUKS header only, and preallocation=falloc/full
4
This is a large change because the lifecycle of VuBlockDev needs to
5
is passed to the underlying file.
5
follow BlockExportDriver. QOM properties are replaced by QAPI options
6
6
objects.
7
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1534951
7
8
8
VuBlockDev is renamed VuBlkExport and contains a BlockExport field.
9
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
9
Several fields can be dropped since BlockExport already has equivalents.
10
Message-id: 20190716161901.1430-1-mlevitsk@redhat.com
10
11
Signed-off-by: Max Reitz <mreitz@redhat.com>
11
The file names and meson build integration will be adjusted in a future
12
patch. libvhost-user should probably be built as a static library that
13
is linked into QEMU instead of as a .c file that results in duplicate
14
compilation.
15
16
The new command-line syntax is:
17
18
$ qemu-storage-daemon \
19
--blockdev file,node-name=drive0,filename=test.img \
20
--export vhost-user-blk,node-name=drive0,id=export0,unix-socket=/tmp/vhost-user-blk.sock
21
22
Note that unix-socket is optional because we may wish to accept chardevs
23
too in the future.
24
25
Markus noted that supported address families are not explicit in the
26
QAPI schema. It is unlikely that support for more address families will
27
be added since file descriptor passing is required and few address
28
families support it. If a new address family needs to be added, then the
29
QAPI 'features' syntax can be used to advertise them.
30
31
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
32
Acked-by: Markus Armbruster <armbru@redhat.com>
33
Message-id: 20200924151549.913737-12-stefanha@redhat.com
34
[Skip test on big-endian host architectures because this device doesn't
35
support them yet (as already mentioned in a code comment).
36
--Stefan]
37
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
12
---
38
---
13
qapi/block-core.json | 6 +++++-
39
qapi/block-export.json | 21 +-
14
block/crypto.c | 30 +++++++++++++++++++++++++++---
40
block/export/vhost-user-blk-server.h | 23 +-
15
2 files changed, 32 insertions(+), 4 deletions(-)
41
block/export/export.c | 6 +
16
42
block/export/vhost-user-blk-server.c | 452 +++++++--------------------
17
diff --git a/qapi/block-core.json b/qapi/block-core.json
43
util/vhost-user-server.c | 10 +-
44
block/export/meson.build | 1 +
45
block/meson.build | 1 -
46
7 files changed, 156 insertions(+), 358 deletions(-)
47
48
diff --git a/qapi/block-export.json b/qapi/block-export.json
18
index XXXXXXX..XXXXXXX 100644
49
index XXXXXXX..XXXXXXX 100644
19
--- a/qapi/block-core.json
50
--- a/qapi/block-export.json
20
+++ b/qapi/block-core.json
51
+++ b/qapi/block-export.json
21
@@ -XXX,XX +XXX,XX @@
52
@@ -XXX,XX +XXX,XX @@
53
'data': { '*name': 'str', '*description': 'str',
54
'*bitmap': 'str' } }
55
56
+##
57
+# @BlockExportOptionsVhostUserBlk:
58
+#
59
+# A vhost-user-blk block export.
60
+#
61
+# @addr: The vhost-user socket on which to listen. Both 'unix' and 'fd'
62
+# SocketAddress types are supported. Passed fds must be UNIX domain
63
+# sockets.
64
+# @logical-block-size: Logical block size in bytes. Defaults to 512 bytes.
65
+#
66
+# Since: 5.2
67
+##
68
+{ 'struct': 'BlockExportOptionsVhostUserBlk',
69
+ 'data': { 'addr': 'SocketAddress', '*logical-block-size': 'size' } }
70
+
71
##
72
# @NbdServerAddOptions:
22
#
73
#
23
# @file Node to create the image format on
74
@@ -XXX,XX +XXX,XX @@
24
# @size Size of the virtual disk in bytes
75
# An enumeration of block export types
25
+# @preallocation Preallocation mode for the new image
26
+# (since: 4.2)
27
+# (default: off; allowed values: off, metadata, falloc, full)
28
#
76
#
29
# Since: 2.12
77
# @nbd: NBD export
78
+# @vhost-user-blk: vhost-user-blk export (since 5.2)
79
#
80
# Since: 4.2
30
##
81
##
31
{ 'struct': 'BlockdevCreateOptionsLUKS',
82
{ 'enum': 'BlockExportType',
32
'base': 'QCryptoBlockCreateOptionsLUKS',
83
- 'data': [ 'nbd' ] }
33
'data': { 'file': 'BlockdevRef',
84
+ 'data': [ 'nbd', 'vhost-user-blk' ] }
34
- 'size': 'size' } }
35
+ 'size': 'size',
36
+ '*preallocation': 'PreallocMode' } }
37
85
38
##
86
##
39
# @BlockdevCreateOptionsNfs:
87
# @BlockExportOptions:
40
diff --git a/block/crypto.c b/block/crypto.c
88
@@ -XXX,XX +XXX,XX @@
89
'*writethrough': 'bool' },
90
'discriminator': 'type',
91
'data': {
92
- 'nbd': 'BlockExportOptionsNbd'
93
+ 'nbd': 'BlockExportOptionsNbd',
94
+ 'vhost-user-blk': 'BlockExportOptionsVhostUserBlk'
95
} }
96
97
##
98
diff --git a/block/export/vhost-user-blk-server.h b/block/export/vhost-user-blk-server.h
41
index XXXXXXX..XXXXXXX 100644
99
index XXXXXXX..XXXXXXX 100644
42
--- a/block/crypto.c
100
--- a/block/export/vhost-user-blk-server.h
43
+++ b/block/crypto.c
101
+++ b/block/export/vhost-user-blk-server.h
44
@@ -XXX,XX +XXX,XX @@ static ssize_t block_crypto_read_func(QCryptoBlock *block,
102
@@ -XXX,XX +XXX,XX @@
45
struct BlockCryptoCreateData {
103
46
BlockBackend *blk;
104
#ifndef VHOST_USER_BLK_SERVER_H
47
uint64_t size;
105
#define VHOST_USER_BLK_SERVER_H
48
+ PreallocMode prealloc;
106
-#include "util/vhost-user-server.h"
107
108
-typedef struct VuBlockDev VuBlockDev;
109
-#define TYPE_VHOST_USER_BLK_SERVER "vhost-user-blk-server"
110
-#define VHOST_USER_BLK_SERVER(obj) \
111
- OBJECT_CHECK(VuBlockDev, obj, TYPE_VHOST_USER_BLK_SERVER)
112
+#include "block/export.h"
113
114
-/* vhost user block device */
115
-struct VuBlockDev {
116
- Object parent_obj;
117
- char *node_name;
118
- SocketAddress *addr;
119
- AioContext *ctx;
120
- VuServer vu_server;
121
- bool running;
122
- uint32_t blk_size;
123
- BlockBackend *backend;
124
- QIOChannelSocket *sioc;
125
- QTAILQ_ENTRY(VuBlockDev) next;
126
- struct virtio_blk_config blkcfg;
127
- bool writable;
128
-};
129
+/* For block/export/export.c */
130
+extern const BlockExportDriver blk_exp_vhost_user_blk;
131
132
#endif /* VHOST_USER_BLK_SERVER_H */
133
diff --git a/block/export/export.c b/block/export/export.c
134
index XXXXXXX..XXXXXXX 100644
135
--- a/block/export/export.c
136
+++ b/block/export/export.c
137
@@ -XXX,XX +XXX,XX @@
138
#include "sysemu/block-backend.h"
139
#include "block/export.h"
140
#include "block/nbd.h"
141
+#if CONFIG_LINUX
142
+#include "block/export/vhost-user-blk-server.h"
143
+#endif
144
#include "qapi/error.h"
145
#include "qapi/qapi-commands-block-export.h"
146
#include "qapi/qapi-events-block-export.h"
147
@@ -XXX,XX +XXX,XX @@
148
149
static const BlockExportDriver *blk_exp_drivers[] = {
150
&blk_exp_nbd,
151
+#if CONFIG_LINUX
152
+ &blk_exp_vhost_user_blk,
153
+#endif
49
};
154
};
50
155
51
156
/* Only accessed from the main thread */
52
@@ -XXX,XX +XXX,XX @@ static ssize_t block_crypto_init_func(QCryptoBlock *block,
157
diff --git a/block/export/vhost-user-blk-server.c b/block/export/vhost-user-blk-server.c
53
* available to the guest, so we must take account of that
158
index XXXXXXX..XXXXXXX 100644
54
* which will be used by the crypto header
159
--- a/block/export/vhost-user-blk-server.c
55
*/
160
+++ b/block/export/vhost-user-blk-server.c
56
- return blk_truncate(data->blk, data->size + headerlen, PREALLOC_MODE_OFF,
161
@@ -XXX,XX +XXX,XX @@
57
+ return blk_truncate(data->blk, data->size + headerlen, data->prealloc,
162
*/
58
errp);
163
#include "qemu/osdep.h"
59
}
164
#include "block/block.h"
60
165
+#include "contrib/libvhost-user/libvhost-user.h"
61
@@ -XXX,XX +XXX,XX @@ static int block_crypto_open_generic(QCryptoBlockFormat format,
166
+#include "standard-headers/linux/virtio_blk.h"
62
static int block_crypto_co_create_generic(BlockDriverState *bs,
167
+#include "util/vhost-user-server.h"
63
int64_t size,
168
#include "vhost-user-blk-server.h"
64
QCryptoBlockCreateOptions *opts,
169
#include "qapi/error.h"
65
+ PreallocMode prealloc,
170
#include "qom/object_interfaces.h"
66
Error **errp)
171
@@ -XXX,XX +XXX,XX @@ struct virtio_blk_inhdr {
67
{
172
unsigned char status;
68
int ret;
173
};
69
@@ -XXX,XX +XXX,XX @@ static int block_crypto_co_create_generic(BlockDriverState *bs,
174
70
goto cleanup;
175
-typedef struct VuBlockReq {
176
+typedef struct VuBlkReq {
177
VuVirtqElement elem;
178
int64_t sector_num;
179
size_t size;
180
@@ -XXX,XX +XXX,XX @@ typedef struct VuBlockReq {
181
struct virtio_blk_outhdr out;
182
VuServer *server;
183
struct VuVirtq *vq;
184
-} VuBlockReq;
185
+} VuBlkReq;
186
187
-static void vu_block_req_complete(VuBlockReq *req)
188
+/* vhost user block device */
189
+typedef struct {
190
+ BlockExport export;
191
+ VuServer vu_server;
192
+ uint32_t blk_size;
193
+ QIOChannelSocket *sioc;
194
+ struct virtio_blk_config blkcfg;
195
+ bool writable;
196
+} VuBlkExport;
197
+
198
+static void vu_blk_req_complete(VuBlkReq *req)
199
{
200
VuDev *vu_dev = &req->server->vu_dev;
201
202
@@ -XXX,XX +XXX,XX @@ static void vu_block_req_complete(VuBlockReq *req)
203
free(req);
204
}
205
206
-static VuBlockDev *get_vu_block_device_by_server(VuServer *server)
207
-{
208
- return container_of(server, VuBlockDev, vu_server);
209
-}
210
-
211
static int coroutine_fn
212
-vu_block_discard_write_zeroes(VuBlockReq *req, struct iovec *iov,
213
- uint32_t iovcnt, uint32_t type)
214
+vu_blk_discard_write_zeroes(BlockBackend *blk, struct iovec *iov,
215
+ uint32_t iovcnt, uint32_t type)
216
{
217
struct virtio_blk_discard_write_zeroes desc;
218
ssize_t size = iov_to_buf(iov, iovcnt, 0, &desc, sizeof(desc));
219
@@ -XXX,XX +XXX,XX @@ vu_block_discard_write_zeroes(VuBlockReq *req, struct iovec *iov,
220
return -EINVAL;
71
}
221
}
72
222
73
+ if (prealloc == PREALLOC_MODE_METADATA) {
223
- VuBlockDev *vdev_blk = get_vu_block_device_by_server(req->server);
74
+ prealloc = PREALLOC_MODE_OFF;
224
uint64_t range[2] = { le64_to_cpu(desc.sector) << 9,
75
+ }
225
le32_to_cpu(desc.num_sectors) << 9 };
226
if (type == VIRTIO_BLK_T_DISCARD) {
227
- if (blk_co_pdiscard(vdev_blk->backend, range[0], range[1]) == 0) {
228
+ if (blk_co_pdiscard(blk, range[0], range[1]) == 0) {
229
return 0;
230
}
231
} else if (type == VIRTIO_BLK_T_WRITE_ZEROES) {
232
- if (blk_co_pwrite_zeroes(vdev_blk->backend,
233
- range[0], range[1], 0) == 0) {
234
+ if (blk_co_pwrite_zeroes(blk, range[0], range[1], 0) == 0) {
235
return 0;
236
}
237
}
238
@@ -XXX,XX +XXX,XX @@ vu_block_discard_write_zeroes(VuBlockReq *req, struct iovec *iov,
239
return -EINVAL;
240
}
241
242
-static int coroutine_fn vu_block_flush(VuBlockReq *req)
243
+static void coroutine_fn vu_blk_virtio_process_req(void *opaque)
244
{
245
- VuBlockDev *vdev_blk = get_vu_block_device_by_server(req->server);
246
- BlockBackend *backend = vdev_blk->backend;
247
- return blk_co_flush(backend);
248
-}
249
-
250
-static void coroutine_fn vu_block_virtio_process_req(void *opaque)
251
-{
252
- VuBlockReq *req = opaque;
253
+ VuBlkReq *req = opaque;
254
VuServer *server = req->server;
255
VuVirtqElement *elem = &req->elem;
256
uint32_t type;
257
258
- VuBlockDev *vdev_blk = get_vu_block_device_by_server(server);
259
- BlockBackend *backend = vdev_blk->backend;
260
+ VuBlkExport *vexp = container_of(server, VuBlkExport, vu_server);
261
+ BlockBackend *blk = vexp->export.blk;
262
263
struct iovec *in_iov = elem->in_sg;
264
struct iovec *out_iov = elem->out_sg;
265
@@ -XXX,XX +XXX,XX @@ static void coroutine_fn vu_block_virtio_process_req(void *opaque)
266
bool is_write = type & VIRTIO_BLK_T_OUT;
267
req->sector_num = le64_to_cpu(req->out.sector);
268
269
- int64_t offset = req->sector_num * vdev_blk->blk_size;
270
+ if (is_write && !vexp->writable) {
271
+ req->in->status = VIRTIO_BLK_S_IOERR;
272
+ break;
273
+ }
76
+
274
+
77
data = (struct BlockCryptoCreateData) {
275
+ int64_t offset = req->sector_num * vexp->blk_size;
78
.blk = blk,
276
QEMUIOVector qiov;
79
.size = size,
277
if (is_write) {
80
+ .prealloc = prealloc,
278
qemu_iovec_init_external(&qiov, out_iov, out_num);
81
};
279
- ret = blk_co_pwritev(backend, offset, qiov.size,
82
280
- &qiov, 0);
83
crypto = qcrypto_block_create(opts, NULL,
281
+ ret = blk_co_pwritev(blk, offset, qiov.size, &qiov, 0);
84
@@ -XXX,XX +XXX,XX @@ block_crypto_co_create_luks(BlockdevCreateOptions *create_options, Error **errp)
282
} else {
85
BlockdevCreateOptionsLUKS *luks_opts;
283
qemu_iovec_init_external(&qiov, in_iov, in_num);
86
BlockDriverState *bs = NULL;
284
- ret = blk_co_preadv(backend, offset, qiov.size,
87
QCryptoBlockCreateOptions create_opts;
285
- &qiov, 0);
88
+ PreallocMode preallocation = PREALLOC_MODE_OFF;
286
+ ret = blk_co_preadv(blk, offset, qiov.size, &qiov, 0);
89
int ret;
287
}
90
288
if (ret >= 0) {
91
assert(create_options->driver == BLOCKDEV_DRIVER_LUKS);
289
req->in->status = VIRTIO_BLK_S_OK;
92
@@ -XXX,XX +XXX,XX @@ block_crypto_co_create_luks(BlockdevCreateOptions *create_options, Error **errp)
290
@@ -XXX,XX +XXX,XX @@ static void coroutine_fn vu_block_virtio_process_req(void *opaque)
93
.u.luks = *qapi_BlockdevCreateOptionsLUKS_base(luks_opts),
291
break;
94
};
292
}
95
293
case VIRTIO_BLK_T_FLUSH:
96
+ if (luks_opts->has_preallocation) {
294
- if (vu_block_flush(req) == 0) {
97
+ preallocation = luks_opts->preallocation;
295
+ if (blk_co_flush(blk) == 0) {
98
+ }
296
req->in->status = VIRTIO_BLK_S_OK;
297
} else {
298
req->in->status = VIRTIO_BLK_S_IOERR;
299
@@ -XXX,XX +XXX,XX @@ static void coroutine_fn vu_block_virtio_process_req(void *opaque)
300
case VIRTIO_BLK_T_DISCARD:
301
case VIRTIO_BLK_T_WRITE_ZEROES: {
302
int rc;
303
- rc = vu_block_discard_write_zeroes(req, &elem->out_sg[1],
304
- out_num, type);
99
+
305
+
100
ret = block_crypto_co_create_generic(bs, luks_opts->size, &create_opts,
306
+ if (!vexp->writable) {
101
- errp);
307
+ req->in->status = VIRTIO_BLK_S_IOERR;
102
+ preallocation, errp);
308
+ break;
103
if (ret < 0) {
309
+ }
104
goto fail;
310
+
311
+ rc = vu_blk_discard_write_zeroes(blk, &elem->out_sg[1], out_num, type);
312
if (rc == 0) {
313
req->in->status = VIRTIO_BLK_S_OK;
314
} else {
315
@@ -XXX,XX +XXX,XX @@ static void coroutine_fn vu_block_virtio_process_req(void *opaque)
316
break;
105
}
317
}
106
@@ -XXX,XX +XXX,XX @@ static int coroutine_fn block_crypto_co_create_opts_luks(const char *filename,
318
107
QCryptoBlockCreateOptions *create_opts = NULL;
319
- vu_block_req_complete(req);
108
BlockDriverState *bs = NULL;
320
+ vu_blk_req_complete(req);
109
QDict *cryptoopts;
321
return;
110
+ PreallocMode prealloc;
322
111
+ char *buf = NULL;
323
err:
112
int64_t size;
324
- free(elem);
113
int ret;
325
+ free(req);
114
+ Error *local_err = NULL;
326
}
115
327
116
/* Parse options */
328
-static void vu_block_process_vq(VuDev *vu_dev, int idx)
117
size = qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0);
329
+static void vu_blk_process_vq(VuDev *vu_dev, int idx)
118
330
{
119
+ buf = qemu_opt_get_del(opts, BLOCK_OPT_PREALLOC);
331
VuServer *server = container_of(vu_dev, VuServer, vu_dev);
120
+ prealloc = qapi_enum_parse(&PreallocMode_lookup, buf,
332
VuVirtq *vq = vu_get_queue(vu_dev, idx);
121
+ PREALLOC_MODE_OFF, &local_err);
333
122
+ g_free(buf);
334
while (1) {
123
+ if (local_err) {
335
- VuBlockReq *req;
336
+ VuBlkReq *req;
337
338
- req = vu_queue_pop(vu_dev, vq, sizeof(VuBlockReq));
339
+ req = vu_queue_pop(vu_dev, vq, sizeof(VuBlkReq));
340
if (!req) {
341
break;
342
}
343
@@ -XXX,XX +XXX,XX @@ static void vu_block_process_vq(VuDev *vu_dev, int idx)
344
req->vq = vq;
345
346
Coroutine *co =
347
- qemu_coroutine_create(vu_block_virtio_process_req, req);
348
+ qemu_coroutine_create(vu_blk_virtio_process_req, req);
349
qemu_coroutine_enter(co);
350
}
351
}
352
353
-static void vu_block_queue_set_started(VuDev *vu_dev, int idx, bool started)
354
+static void vu_blk_queue_set_started(VuDev *vu_dev, int idx, bool started)
355
{
356
VuVirtq *vq;
357
358
assert(vu_dev);
359
360
vq = vu_get_queue(vu_dev, idx);
361
- vu_set_queue_handler(vu_dev, vq, started ? vu_block_process_vq : NULL);
362
+ vu_set_queue_handler(vu_dev, vq, started ? vu_blk_process_vq : NULL);
363
}
364
365
-static uint64_t vu_block_get_features(VuDev *dev)
366
+static uint64_t vu_blk_get_features(VuDev *dev)
367
{
368
uint64_t features;
369
VuServer *server = container_of(dev, VuServer, vu_dev);
370
- VuBlockDev *vdev_blk = get_vu_block_device_by_server(server);
371
+ VuBlkExport *vexp = container_of(server, VuBlkExport, vu_server);
372
features = 1ull << VIRTIO_BLK_F_SIZE_MAX |
373
1ull << VIRTIO_BLK_F_SEG_MAX |
374
1ull << VIRTIO_BLK_F_TOPOLOGY |
375
@@ -XXX,XX +XXX,XX @@ static uint64_t vu_block_get_features(VuDev *dev)
376
1ull << VIRTIO_RING_F_EVENT_IDX |
377
1ull << VHOST_USER_F_PROTOCOL_FEATURES;
378
379
- if (!vdev_blk->writable) {
380
+ if (!vexp->writable) {
381
features |= 1ull << VIRTIO_BLK_F_RO;
382
}
383
384
return features;
385
}
386
387
-static uint64_t vu_block_get_protocol_features(VuDev *dev)
388
+static uint64_t vu_blk_get_protocol_features(VuDev *dev)
389
{
390
return 1ull << VHOST_USER_PROTOCOL_F_CONFIG |
391
1ull << VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD;
392
}
393
394
static int
395
-vu_block_get_config(VuDev *vu_dev, uint8_t *config, uint32_t len)
396
+vu_blk_get_config(VuDev *vu_dev, uint8_t *config, uint32_t len)
397
{
398
+ /* TODO blkcfg must be little-endian for VIRTIO 1.0 */
399
VuServer *server = container_of(vu_dev, VuServer, vu_dev);
400
- VuBlockDev *vdev_blk = get_vu_block_device_by_server(server);
401
- memcpy(config, &vdev_blk->blkcfg, len);
402
-
403
+ VuBlkExport *vexp = container_of(server, VuBlkExport, vu_server);
404
+ memcpy(config, &vexp->blkcfg, len);
405
return 0;
406
}
407
408
static int
409
-vu_block_set_config(VuDev *vu_dev, const uint8_t *data,
410
+vu_blk_set_config(VuDev *vu_dev, const uint8_t *data,
411
uint32_t offset, uint32_t size, uint32_t flags)
412
{
413
VuServer *server = container_of(vu_dev, VuServer, vu_dev);
414
- VuBlockDev *vdev_blk = get_vu_block_device_by_server(server);
415
+ VuBlkExport *vexp = container_of(server, VuBlkExport, vu_server);
416
uint8_t wce;
417
418
/* don't support live migration */
419
@@ -XXX,XX +XXX,XX @@ vu_block_set_config(VuDev *vu_dev, const uint8_t *data,
420
}
421
422
wce = *data;
423
- vdev_blk->blkcfg.wce = wce;
424
- blk_set_enable_write_cache(vdev_blk->backend, wce);
425
+ vexp->blkcfg.wce = wce;
426
+ blk_set_enable_write_cache(vexp->export.blk, wce);
427
return 0;
428
}
429
430
@@ -XXX,XX +XXX,XX @@ vu_block_set_config(VuDev *vu_dev, const uint8_t *data,
431
* of vu_process_message.
432
*
433
*/
434
-static int vu_block_process_msg(VuDev *dev, VhostUserMsg *vmsg, int *do_reply)
435
+static int vu_blk_process_msg(VuDev *dev, VhostUserMsg *vmsg, int *do_reply)
436
{
437
if (vmsg->request == VHOST_USER_NONE) {
438
dev->panic(dev, "disconnect");
439
@@ -XXX,XX +XXX,XX @@ static int vu_block_process_msg(VuDev *dev, VhostUserMsg *vmsg, int *do_reply)
440
return false;
441
}
442
443
-static const VuDevIface vu_block_iface = {
444
- .get_features = vu_block_get_features,
445
- .queue_set_started = vu_block_queue_set_started,
446
- .get_protocol_features = vu_block_get_protocol_features,
447
- .get_config = vu_block_get_config,
448
- .set_config = vu_block_set_config,
449
- .process_msg = vu_block_process_msg,
450
+static const VuDevIface vu_blk_iface = {
451
+ .get_features = vu_blk_get_features,
452
+ .queue_set_started = vu_blk_queue_set_started,
453
+ .get_protocol_features = vu_blk_get_protocol_features,
454
+ .get_config = vu_blk_get_config,
455
+ .set_config = vu_blk_set_config,
456
+ .process_msg = vu_blk_process_msg,
457
};
458
459
static void blk_aio_attached(AioContext *ctx, void *opaque)
460
{
461
- VuBlockDev *vub_dev = opaque;
462
- vhost_user_server_attach_aio_context(&vub_dev->vu_server, ctx);
463
+ VuBlkExport *vexp = opaque;
464
+ vhost_user_server_attach_aio_context(&vexp->vu_server, ctx);
465
}
466
467
static void blk_aio_detach(void *opaque)
468
{
469
- VuBlockDev *vub_dev = opaque;
470
- vhost_user_server_detach_aio_context(&vub_dev->vu_server);
471
+ VuBlkExport *vexp = opaque;
472
+ vhost_user_server_detach_aio_context(&vexp->vu_server);
473
}
474
475
static void
476
-vu_block_initialize_config(BlockDriverState *bs,
477
+vu_blk_initialize_config(BlockDriverState *bs,
478
struct virtio_blk_config *config, uint32_t blk_size)
479
{
480
config->capacity = bdrv_getlength(bs) >> BDRV_SECTOR_BITS;
481
@@ -XXX,XX +XXX,XX @@ vu_block_initialize_config(BlockDriverState *bs,
482
config->max_write_zeroes_seg = 1;
483
}
484
485
-static VuBlockDev *vu_block_init(VuBlockDev *vu_block_device, Error **errp)
486
+static void vu_blk_exp_request_shutdown(BlockExport *exp)
487
{
488
+ VuBlkExport *vexp = container_of(exp, VuBlkExport, export);
489
490
- BlockBackend *blk;
491
- Error *local_error = NULL;
492
- const char *node_name = vu_block_device->node_name;
493
- bool writable = vu_block_device->writable;
494
- uint64_t perm = BLK_PERM_CONSISTENT_READ;
495
- int ret;
496
-
497
- AioContext *ctx;
498
-
499
- BlockDriverState *bs = bdrv_lookup_bs(node_name, node_name, &local_error);
500
-
501
- if (!bs) {
502
- error_propagate(errp, local_error);
503
- return NULL;
504
- }
505
-
506
- if (bdrv_is_read_only(bs)) {
507
- writable = false;
508
- }
509
-
510
- if (writable) {
511
- perm |= BLK_PERM_WRITE;
512
- }
513
-
514
- ctx = bdrv_get_aio_context(bs);
515
- aio_context_acquire(ctx);
516
- bdrv_invalidate_cache(bs, NULL);
517
- aio_context_release(ctx);
518
-
519
- /*
520
- * Don't allow resize while the vhost user server is running,
521
- * otherwise we don't care what happens with the node.
522
- */
523
- blk = blk_new(bdrv_get_aio_context(bs), perm,
524
- BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE_UNCHANGED |
525
- BLK_PERM_WRITE | BLK_PERM_GRAPH_MOD);
526
- ret = blk_insert_bs(blk, bs, errp);
527
-
528
- if (ret < 0) {
529
- goto fail;
530
- }
531
-
532
- blk_set_enable_write_cache(blk, false);
533
-
534
- blk_set_allow_aio_context_change(blk, true);
535
-
536
- vu_block_device->blkcfg.wce = 0;
537
- vu_block_device->backend = blk;
538
- if (!vu_block_device->blk_size) {
539
- vu_block_device->blk_size = BDRV_SECTOR_SIZE;
540
- }
541
- vu_block_device->blkcfg.blk_size = vu_block_device->blk_size;
542
- blk_set_guest_block_size(blk, vu_block_device->blk_size);
543
- vu_block_initialize_config(bs, &vu_block_device->blkcfg,
544
- vu_block_device->blk_size);
545
- return vu_block_device;
546
-
547
-fail:
548
- blk_unref(blk);
549
- return NULL;
550
-}
551
-
552
-static void vu_block_deinit(VuBlockDev *vu_block_device)
553
-{
554
- if (vu_block_device->backend) {
555
- blk_remove_aio_context_notifier(vu_block_device->backend, blk_aio_attached,
556
- blk_aio_detach, vu_block_device);
557
- }
558
-
559
- blk_unref(vu_block_device->backend);
560
-}
561
-
562
-static void vhost_user_blk_server_stop(VuBlockDev *vu_block_device)
563
-{
564
- vhost_user_server_stop(&vu_block_device->vu_server);
565
- vu_block_deinit(vu_block_device);
566
-}
567
-
568
-static void vhost_user_blk_server_start(VuBlockDev *vu_block_device,
569
- Error **errp)
570
-{
571
- AioContext *ctx;
572
- SocketAddress *addr = vu_block_device->addr;
573
-
574
- if (!vu_block_init(vu_block_device, errp)) {
575
- return;
576
- }
577
-
578
- ctx = bdrv_get_aio_context(blk_bs(vu_block_device->backend));
579
-
580
- if (!vhost_user_server_start(&vu_block_device->vu_server, addr, ctx,
581
- VHOST_USER_BLK_MAX_QUEUES, &vu_block_iface,
582
- errp)) {
583
- goto error;
584
- }
585
-
586
- blk_add_aio_context_notifier(vu_block_device->backend, blk_aio_attached,
587
- blk_aio_detach, vu_block_device);
588
- vu_block_device->running = true;
589
- return;
590
-
591
- error:
592
- vu_block_deinit(vu_block_device);
593
-}
594
-
595
-static bool vu_prop_modifiable(VuBlockDev *vus, Error **errp)
596
-{
597
- if (vus->running) {
598
- error_setg(errp, "The property can't be modified "
599
- "while the server is running");
600
- return false;
601
- }
602
- return true;
603
-}
604
-
605
-static void vu_set_node_name(Object *obj, const char *value, Error **errp)
606
-{
607
- VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
608
-
609
- if (!vu_prop_modifiable(vus, errp)) {
610
- return;
611
- }
612
-
613
- if (vus->node_name) {
614
- g_free(vus->node_name);
615
- }
616
-
617
- vus->node_name = g_strdup(value);
618
-}
619
-
620
-static char *vu_get_node_name(Object *obj, Error **errp)
621
-{
622
- VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
623
- return g_strdup(vus->node_name);
624
-}
625
-
626
-static void free_socket_addr(SocketAddress *addr)
627
-{
628
- g_free(addr->u.q_unix.path);
629
- g_free(addr);
630
-}
631
-
632
-static void vu_set_unix_socket(Object *obj, const char *value,
633
- Error **errp)
634
-{
635
- VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
636
-
637
- if (!vu_prop_modifiable(vus, errp)) {
638
- return;
639
- }
640
-
641
- if (vus->addr) {
642
- free_socket_addr(vus->addr);
643
- }
644
-
645
- SocketAddress *addr = g_new0(SocketAddress, 1);
646
- addr->type = SOCKET_ADDRESS_TYPE_UNIX;
647
- addr->u.q_unix.path = g_strdup(value);
648
- vus->addr = addr;
649
+ vhost_user_server_stop(&vexp->vu_server);
650
}
651
652
-static char *vu_get_unix_socket(Object *obj, Error **errp)
653
+static int vu_blk_exp_create(BlockExport *exp, BlockExportOptions *opts,
654
+ Error **errp)
655
{
656
- VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
657
- return g_strdup(vus->addr->u.q_unix.path);
658
-}
659
-
660
-static bool vu_get_block_writable(Object *obj, Error **errp)
661
-{
662
- VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
663
- return vus->writable;
664
-}
665
-
666
-static void vu_set_block_writable(Object *obj, bool value, Error **errp)
667
-{
668
- VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
669
-
670
- if (!vu_prop_modifiable(vus, errp)) {
671
- return;
672
- }
673
-
674
- vus->writable = value;
675
-}
676
-
677
-static void vu_get_blk_size(Object *obj, Visitor *v, const char *name,
678
- void *opaque, Error **errp)
679
-{
680
- VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
681
- uint32_t value = vus->blk_size;
682
-
683
- visit_type_uint32(v, name, &value, errp);
684
-}
685
-
686
-static void vu_set_blk_size(Object *obj, Visitor *v, const char *name,
687
- void *opaque, Error **errp)
688
-{
689
- VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
690
-
691
+ VuBlkExport *vexp = container_of(exp, VuBlkExport, export);
692
+ BlockExportOptionsVhostUserBlk *vu_opts = &opts->u.vhost_user_blk;
693
Error *local_err = NULL;
694
- uint32_t value;
695
+ uint64_t logical_block_size;
696
697
- if (!vu_prop_modifiable(vus, errp)) {
698
- return;
699
- }
700
+ vexp->writable = opts->writable;
701
+ vexp->blkcfg.wce = 0;
702
703
- visit_type_uint32(v, name, &value, &local_err);
704
- if (local_err) {
705
- goto out;
706
+ if (vu_opts->has_logical_block_size) {
707
+ logical_block_size = vu_opts->logical_block_size;
708
+ } else {
709
+ logical_block_size = BDRV_SECTOR_SIZE;
710
}
711
-
712
- check_block_size(object_get_typename(obj), name, value, &local_err);
713
+ check_block_size(exp->id, "logical-block-size", logical_block_size,
714
+ &local_err);
715
if (local_err) {
716
- goto out;
124
+ error_propagate(errp, local_err);
717
+ error_propagate(errp, local_err);
125
+ return -EINVAL;
718
+ return -EINVAL;
126
+ }
719
+ }
720
+ vexp->blk_size = logical_block_size;
721
+ blk_set_guest_block_size(exp->blk, logical_block_size);
722
+ vu_blk_initialize_config(blk_bs(exp->blk), &vexp->blkcfg,
723
+ logical_block_size);
127
+
724
+
128
cryptoopts = qemu_opts_to_qdict_filtered(opts, NULL,
725
+ blk_set_allow_aio_context_change(exp->blk, true);
129
&block_crypto_create_opts_luks,
726
+ blk_add_aio_context_notifier(exp->blk, blk_aio_attached, blk_aio_detach,
130
true);
727
+ vexp);
131
@@ -XXX,XX +XXX,XX @@ static int coroutine_fn block_crypto_co_create_opts_luks(const char *filename,
728
+
729
+ if (!vhost_user_server_start(&vexp->vu_server, vu_opts->addr, exp->ctx,
730
+ VHOST_USER_BLK_MAX_QUEUES, &vu_blk_iface,
731
+ errp)) {
732
+ blk_remove_aio_context_notifier(exp->blk, blk_aio_attached,
733
+ blk_aio_detach, vexp);
734
+ return -EADDRNOTAVAIL;
132
}
735
}
133
736
134
/* Create format layer */
737
- vus->blk_size = value;
135
- ret = block_crypto_co_create_generic(bs, size, create_opts, errp);
738
-
136
+ ret = block_crypto_co_create_generic(bs, size, create_opts, prealloc, errp);
739
-out:
137
if (ret < 0) {
740
- error_propagate(errp, local_err);
138
goto fail;
741
-}
139
}
742
-
743
-static void vhost_user_blk_server_instance_finalize(Object *obj)
744
-{
745
- VuBlockDev *vub = VHOST_USER_BLK_SERVER(obj);
746
-
747
- vhost_user_blk_server_stop(vub);
748
-
749
- /*
750
- * Unlike object_property_add_str, object_class_property_add_str
751
- * doesn't have a release method. Thus manual memory freeing is
752
- * needed.
753
- */
754
- free_socket_addr(vub->addr);
755
- g_free(vub->node_name);
756
-}
757
-
758
-static void vhost_user_blk_server_complete(UserCreatable *obj, Error **errp)
759
-{
760
- VuBlockDev *vub = VHOST_USER_BLK_SERVER(obj);
761
-
762
- vhost_user_blk_server_start(vub, errp);
763
+ return 0;
764
}
765
766
-static void vhost_user_blk_server_class_init(ObjectClass *klass,
767
- void *class_data)
768
+static void vu_blk_exp_delete(BlockExport *exp)
769
{
770
- UserCreatableClass *ucc = USER_CREATABLE_CLASS(klass);
771
- ucc->complete = vhost_user_blk_server_complete;
772
-
773
- object_class_property_add_bool(klass, "writable",
774
- vu_get_block_writable,
775
- vu_set_block_writable);
776
-
777
- object_class_property_add_str(klass, "node-name",
778
- vu_get_node_name,
779
- vu_set_node_name);
780
-
781
- object_class_property_add_str(klass, "unix-socket",
782
- vu_get_unix_socket,
783
- vu_set_unix_socket);
784
+ VuBlkExport *vexp = container_of(exp, VuBlkExport, export);
785
786
- object_class_property_add(klass, "logical-block-size", "uint32",
787
- vu_get_blk_size, vu_set_blk_size,
788
- NULL, NULL);
789
+ blk_remove_aio_context_notifier(exp->blk, blk_aio_attached, blk_aio_detach,
790
+ vexp);
791
}
792
793
-static const TypeInfo vhost_user_blk_server_info = {
794
- .name = TYPE_VHOST_USER_BLK_SERVER,
795
- .parent = TYPE_OBJECT,
796
- .instance_size = sizeof(VuBlockDev),
797
- .instance_finalize = vhost_user_blk_server_instance_finalize,
798
- .class_init = vhost_user_blk_server_class_init,
799
- .interfaces = (InterfaceInfo[]) {
800
- {TYPE_USER_CREATABLE},
801
- {}
802
- },
803
+const BlockExportDriver blk_exp_vhost_user_blk = {
804
+ .type = BLOCK_EXPORT_TYPE_VHOST_USER_BLK,
805
+ .instance_size = sizeof(VuBlkExport),
806
+ .create = vu_blk_exp_create,
807
+ .delete = vu_blk_exp_delete,
808
+ .request_shutdown = vu_blk_exp_request_shutdown,
809
};
810
-
811
-static void vhost_user_blk_server_register_types(void)
812
-{
813
- type_register_static(&vhost_user_blk_server_info);
814
-}
815
-
816
-type_init(vhost_user_blk_server_register_types)
817
diff --git a/util/vhost-user-server.c b/util/vhost-user-server.c
818
index XXXXXXX..XXXXXXX 100644
819
--- a/util/vhost-user-server.c
820
+++ b/util/vhost-user-server.c
821
@@ -XXX,XX +XXX,XX @@ bool vhost_user_server_start(VuServer *server,
822
Error **errp)
823
{
824
QEMUBH *bh;
825
- QIONetListener *listener = qio_net_listener_new();
826
+ QIONetListener *listener;
827
+
828
+ if (socket_addr->type != SOCKET_ADDRESS_TYPE_UNIX &&
829
+ socket_addr->type != SOCKET_ADDRESS_TYPE_FD) {
830
+ error_setg(errp, "Only socket address types 'unix' and 'fd' are supported");
831
+ return false;
832
+ }
833
+
834
+ listener = qio_net_listener_new();
835
if (qio_net_listener_open_sync(listener, socket_addr, 1,
836
errp) < 0) {
837
object_unref(OBJECT(listener));
838
diff --git a/block/export/meson.build b/block/export/meson.build
839
index XXXXXXX..XXXXXXX 100644
840
--- a/block/export/meson.build
841
+++ b/block/export/meson.build
842
@@ -1 +1,2 @@
843
block_ss.add(files('export.c'))
844
+block_ss.add(when: 'CONFIG_LINUX', if_true: files('vhost-user-blk-server.c', '../../contrib/libvhost-user/libvhost-user.c'))
845
diff --git a/block/meson.build b/block/meson.build
846
index XXXXXXX..XXXXXXX 100644
847
--- a/block/meson.build
848
+++ b/block/meson.build
849
@@ -XXX,XX +XXX,XX @@ block_ss.add(when: 'CONFIG_WIN32', if_true: files('file-win32.c', 'win32-aio.c')
850
block_ss.add(when: 'CONFIG_POSIX', if_true: [files('file-posix.c'), coref, iokit])
851
block_ss.add(when: 'CONFIG_LIBISCSI', if_true: files('iscsi-opts.c'))
852
block_ss.add(when: 'CONFIG_LINUX', if_true: files('nvme.c'))
853
-block_ss.add(when: 'CONFIG_LINUX', if_true: files('export/vhost-user-blk-server.c', '../contrib/libvhost-user/libvhost-user.c'))
854
block_ss.add(when: 'CONFIG_REPLICATION', if_true: files('replication.c'))
855
block_ss.add(when: 'CONFIG_SHEEPDOG', if_true: files('sheepdog.c'))
856
block_ss.add(when: ['CONFIG_LINUX_AIO', libaio], if_true: files('linux-aio.c'))
140
--
857
--
141
2.21.0
858
2.26.2
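The refactored server code in the patch above recovers its per-export state by applying container_of() to the embedded VuServer, and it fails write requests when the export is read-only. The following is a minimal, standalone C sketch of that pattern. The Demo* types, demo_container_of() and the DEMO_S_* status codes are hypothetical stand-ins invented for illustration; they are not QEMU's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for QEMU's BlockExport and VuServer. */
typedef struct { int id; } DemoExport;
typedef struct { int fd; } DemoServer;

/* Per-export state embedding both objects, loosely modelled on VuBlkExport. */
typedef struct {
    DemoExport export;
    DemoServer vu_server;
    uint32_t blk_size;
    bool writable;
} DemoBlkExport;

/* Simplified container_of: step back from a member pointer to its parent. */
#define demo_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Illustrative status codes, not the VIRTIO_BLK_S_* values. */
enum { DEMO_S_OK = 0, DEMO_S_IOERR = 1 };

/* Writes to a read-only export are answered with an I/O error status. */
static inline int demo_check_request(DemoServer *server, bool is_write)
{
    assert(server != NULL);
    DemoBlkExport *vexp =
        demo_container_of(server, DemoBlkExport, vu_server);

    if (is_write && !vexp->writable) {
        return DEMO_S_IOERR;
    }
    return DEMO_S_OK;
}

/* Byte offset of a request: sector number times the export's block size. */
static inline int64_t demo_request_offset(DemoServer *server,
                                          uint64_t sector_num)
{
    assert(server != NULL);
    DemoBlkExport *vexp =
        demo_container_of(server, DemoBlkExport, vu_server);

    return (int64_t)(sector_num * vexp->blk_size);
}
```

Because the server object is embedded in the export structure, no separate lookup helper is needed: any callback that receives the server pointer can reach the export's fields directly, which is what allowed the patch to drop get_vu_block_device_by_server().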
Headers used by other subsystems are located in include/. Also add the
vhost-user-server and vhost-user-blk-server headers to MAINTAINERS.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20200924151549.913737-13-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 MAINTAINERS | 4 +++-
 {util => include/qemu}/vhost-user-server.h | 0
 block/export/vhost-user-blk-server.c | 2 +-
 util/vhost-user-server.c | 2 +-
 4 files changed, 5 insertions(+), 3 deletions(-)
 rename {util => include/qemu}/vhost-user-server.h (100%)

diff --git a/MAINTAINERS b/MAINTAINERS
index XXXXXXX..XXXXXXX 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -XXX,XX +XXX,XX @@ Vhost-user block device backend server
M: Coiby Xu <Coiby.Xu@gmail.com>
S: Maintained
F: block/export/vhost-user-blk-server.c
-F: util/vhost-user-server.c
+F: block/export/vhost-user-blk-server.h
+F: include/qemu/vhost-user-server.h
F: tests/qtest/libqos/vhost-user-blk.c
+F: util/vhost-user-server.c

Replication
M: Wen Congyang <wencongyang2@huawei.com>
diff --git a/util/vhost-user-server.h b/include/qemu/vhost-user-server.h
similarity index 100%
rename from util/vhost-user-server.h
rename to include/qemu/vhost-user-server.h
diff --git a/block/export/vhost-user-blk-server.c b/block/export/vhost-user-blk-server.c
index XXXXXXX..XXXXXXX 100644
--- a/block/export/vhost-user-blk-server.c
+++ b/block/export/vhost-user-blk-server.c
@@ -XXX,XX +XXX,XX @@
#include "block/block.h"
#include "contrib/libvhost-user/libvhost-user.h"
#include "standard-headers/linux/virtio_blk.h"
-#include "util/vhost-user-server.h"
+#include "qemu/vhost-user-server.h"
#include "vhost-user-blk-server.h"
#include "qapi/error.h"
#include "qom/object_interfaces.h"
diff --git a/util/vhost-user-server.c b/util/vhost-user-server.c
index XXXXXXX..XXXXXXX 100644
--- a/util/vhost-user-server.c
+++ b/util/vhost-user-server.c
@@ -XXX,XX +XXX,XX @@
 */
#include "qemu/osdep.h"
#include "qemu/main-loop.h"
+#include "qemu/vhost-user-server.h"
#include "block/aio-wait.h"
-#include "vhost-user-server.h"

/*
 * Theory of operation:
--
2.26.2
1
Fixes: 69f47505ee66afaa513305de0c1895a224e52c45
1
Don't compile contrib/libvhost-user/libvhost-user.c again. Instead build
2
Signed-off-by: Max Reitz <mreitz@redhat.com>
2
the static library once and then reuse it throughout QEMU.
3
Message-id: 20190725155512.9827-3-mreitz@redhat.com
3
4
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
4
Also switch from CONFIG_LINUX to CONFIG_VHOST_USER, which is what the
5
Reviewed-by: John Snow <jsnow@redhat.com>
5
vhost-user tools (vhost-user-gpu, etc) do.
6
Signed-off-by: Max Reitz <mreitz@redhat.com>
6
7
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
8
Message-id: 20200924151549.913737-14-stefanha@redhat.com
9
[Added CONFIG_LINUX again because libvhost-user doesn't build on macOS.
10
--Stefan]
11
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
7
---
12
---
8
block/vmdk.c | 3 +++
13
block/export/export.c | 8 ++++----
9
1 file changed, 3 insertions(+)
14
block/export/meson.build | 2 +-
15
contrib/libvhost-user/meson.build | 1 +
16
meson.build | 6 +++++-
17
util/meson.build | 4 +++-
18
5 files changed, 14 insertions(+), 7 deletions(-)
10
19
11
diff --git a/block/vmdk.c b/block/vmdk.c
20
diff --git a/block/export/export.c b/block/export/export.c
12
index XXXXXXX..XXXXXXX 100644
21
index XXXXXXX..XXXXXXX 100644
13
--- a/block/vmdk.c
22
--- a/block/export/export.c
14
+++ b/block/vmdk.c
23
+++ b/block/export/export.c
15
@@ -XXX,XX +XXX,XX @@ static int coroutine_fn vmdk_co_block_status(BlockDriverState *bs,
24
@@ -XXX,XX +XXX,XX @@
16
if (!extent->compressed) {
25
#include "sysemu/block-backend.h"
17
ret |= BDRV_BLOCK_OFFSET_VALID;
26
#include "block/export.h"
18
*map = cluster_offset + index_in_cluster;
27
#include "block/nbd.h"
19
+ if (extent->flat) {
28
-#if CONFIG_LINUX
20
+ ret |= BDRV_BLOCK_RECURSE;
29
-#include "block/export/vhost-user-blk-server.h"
21
+ }
30
-#endif
22
}
31
#include "qapi/error.h"
23
*file = extent->file->bs;
32
#include "qapi/qapi-commands-block-export.h"
24
break;
33
#include "qapi/qapi-events-block-export.h"
34
#include "qemu/id.h"
35
+#ifdef CONFIG_VHOST_USER
36
+#include "vhost-user-blk-server.h"
37
+#endif
38
39
static const BlockExportDriver *blk_exp_drivers[] = {
40
&blk_exp_nbd,
41
-#if CONFIG_LINUX
42
+#ifdef CONFIG_VHOST_USER
43
&blk_exp_vhost_user_blk,
44
#endif
45
};
46
diff --git a/block/export/meson.build b/block/export/meson.build
47
index XXXXXXX..XXXXXXX 100644
48
--- a/block/export/meson.build
49
+++ b/block/export/meson.build
50
@@ -XXX,XX +XXX,XX @@
51
block_ss.add(files('export.c'))
52
-block_ss.add(when: 'CONFIG_LINUX', if_true: files('vhost-user-blk-server.c', '../../contrib/libvhost-user/libvhost-user.c'))
53
+block_ss.add(when: ['CONFIG_LINUX', 'CONFIG_VHOST_USER'], if_true: files('vhost-user-blk-server.c'))
54
diff --git a/contrib/libvhost-user/meson.build b/contrib/libvhost-user/meson.build
55
index XXXXXXX..XXXXXXX 100644
56
--- a/contrib/libvhost-user/meson.build
57
+++ b/contrib/libvhost-user/meson.build
58
@@ -XXX,XX +XXX,XX @@
59
libvhost_user = static_library('vhost-user',
60
files('libvhost-user.c', 'libvhost-user-glib.c'),
61
build_by_default: false)
62
+vhost_user = declare_dependency(link_with: libvhost_user)
63
diff --git a/meson.build b/meson.build
64
index XXXXXXX..XXXXXXX 100644
65
--- a/meson.build
66
+++ b/meson.build
67
@@ -XXX,XX +XXX,XX @@ trace_events_subdirs += [
68
'util',
69
]
70
71
+vhost_user = not_found
72
+if 'CONFIG_VHOST_USER' in config_host
73
+ subdir('contrib/libvhost-user')
74
+endif
75
+
76
subdir('qapi')
77
subdir('qobject')
78
subdir('stubs')
79
@@ -XXX,XX +XXX,XX @@ if have_tools
80
install: true)
81
82
if 'CONFIG_VHOST_USER' in config_host
83
- subdir('contrib/libvhost-user')
84
subdir('contrib/vhost-user-blk')
85
subdir('contrib/vhost-user-gpu')
86
subdir('contrib/vhost-user-input')
87
diff --git a/util/meson.build b/util/meson.build
88
index XXXXXXX..XXXXXXX 100644
89
--- a/util/meson.build
90
+++ b/util/meson.build
91
@@ -XXX,XX +XXX,XX @@ if have_block
92
util_ss.add(files('main-loop.c'))
93
util_ss.add(files('nvdimm-utils.c'))
94
util_ss.add(files('qemu-coroutine.c', 'qemu-coroutine-lock.c', 'qemu-coroutine-io.c'))
95
- util_ss.add(when: 'CONFIG_LINUX', if_true: files('vhost-user-server.c'))
96
+ util_ss.add(when: ['CONFIG_LINUX', 'CONFIG_VHOST_USER'], if_true: [
97
+ files('vhost-user-server.c'), vhost_user
98
+ ])
99
util_ss.add(files('block-helpers.c'))
100
util_ss.add(files('qemu-coroutine-sleep.c'))
101
util_ss.add(files('qemu-co-shared-resource.c'))
25
--
102
--
26
2.21.0
103
2.26.2
1
The result of a sync=full mirror should always be equal to the
1
Introduce libblkdev.fa to avoid recompiling blockdev_ss twice.
2
input. Therefore, existing images should be treated as potentially
3
non-zero and thus should be explicitly initialized to be zero
4
beforehand.
5
2
6
Signed-off-by: Max Reitz <mreitz@redhat.com>
3
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
7
Message-id: 20190724171239.8764-12-mreitz@redhat.com
4
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
8
Signed-off-by: Max Reitz <mreitz@redhat.com>
5
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
6
Message-id: 20200929125516.186715-3-stefanha@redhat.com
7
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
9
---
8
---
10
tests/qemu-iotests/041 | 62 +++++++++++++++++++++++++++++++++++---
9
meson.build | 12 ++++++++++--
11
tests/qemu-iotests/041.out | 4 +--
10
storage-daemon/meson.build | 3 +--
12
2 files changed, 60 insertions(+), 6 deletions(-)
11
2 files changed, 11 insertions(+), 4 deletions(-)
13
12
14
diff --git a/tests/qemu-iotests/041 b/tests/qemu-iotests/041
13
diff --git a/meson.build b/meson.build
15
index XXXXXXX..XXXXXXX 100755
14
index XXXXXXX..XXXXXXX 100644
16
--- a/tests/qemu-iotests/041
15
--- a/meson.build
17
+++ b/tests/qemu-iotests/041
16
+++ b/meson.build
18
@@ -XXX,XX +XXX,XX @@ class TestUnbackedSource(iotests.QMPTestCase):
17
@@ -XXX,XX +XXX,XX @@ blockdev_ss.add(files(
19
def setUp(self):
18
# os-win32.c does not
20
qemu_img('create', '-f', iotests.imgfmt, test_img,
19
blockdev_ss.add(when: 'CONFIG_POSIX', if_true: files('os-posix.c'))
21
str(TestUnbackedSource.image_len))
20
softmmu_ss.add(when: 'CONFIG_WIN32', if_true: [files('os-win32.c')])
22
- self.vm = iotests.VM().add_drive(test_img)
21
-softmmu_ss.add_all(blockdev_ss)
23
+ self.vm = iotests.VM()
22
24
self.vm.launch()
23
common_ss.add(files('cpus-common.c'))
25
+ result = self.vm.qmp('blockdev-add', node_name='drive0',
24
26
+ driver=iotests.imgfmt,
25
@@ -XXX,XX +XXX,XX @@ block = declare_dependency(link_whole: [libblock],
27
+ file={
26
link_args: '@block.syms',
28
+ 'driver': 'file',
27
dependencies: [crypto, io])
29
+ 'filename': test_img,
28
30
+ })
29
+blockdev_ss = blockdev_ss.apply(config_host, strict: false)
31
+ self.assert_qmp(result, 'return', {})
30
+libblockdev = static_library('blockdev', blockdev_ss.sources() + genh,
32
31
+ dependencies: blockdev_ss.dependencies(),
33
def tearDown(self):
32
+ name_suffix: 'fa',
34
self.vm.shutdown()
33
+ build_by_default: false)
35
@@ -XXX,XX +XXX,XX @@ class TestUnbackedSource(iotests.QMPTestCase):
36
37
def test_absolute_paths_full(self):
38
self.assert_no_active_block_jobs()
39
- result = self.vm.qmp('drive-mirror', device='drive0',
40
+ result = self.vm.qmp('drive-mirror', job_id='drive0', device='drive0',
41
sync='full', target=target_img,
42
mode='absolute-paths')
43
self.assert_qmp(result, 'return', {})
44
@@ -XXX,XX +XXX,XX @@ class TestUnbackedSource(iotests.QMPTestCase):
45
46
def test_absolute_paths_top(self):
47
self.assert_no_active_block_jobs()
48
- result = self.vm.qmp('drive-mirror', device='drive0',
49
+ result = self.vm.qmp('drive-mirror', job_id='drive0', device='drive0',
50
sync='top', target=target_img,
51
mode='absolute-paths')
52
self.assert_qmp(result, 'return', {})
53
@@ -XXX,XX +XXX,XX @@ class TestUnbackedSource(iotests.QMPTestCase):
54
55
def test_absolute_paths_none(self):
56
self.assert_no_active_block_jobs()
57
- result = self.vm.qmp('drive-mirror', device='drive0',
58
+ result = self.vm.qmp('drive-mirror', job_id='drive0', device='drive0',
59
sync='none', target=target_img,
60
mode='absolute-paths')
61
self.assert_qmp(result, 'return', {})
62
self.complete_and_wait()
63
self.assert_no_active_block_jobs()
64
65
+ def test_existing_full(self):
66
+ qemu_img('create', '-f', iotests.imgfmt, target_img,
67
+ str(self.image_len))
68
+ qemu_io('-c', 'write -P 42 0 64k', target_img)
69
+
34
+
70
+ self.assert_no_active_block_jobs()
35
+blockdev = declare_dependency(link_whole: [libblockdev],
71
+ result = self.vm.qmp('drive-mirror', job_id='drive0', device='drive0',
36
+ dependencies: [block])
72
+ sync='full', target=target_img, mode='existing')
73
+ self.assert_qmp(result, 'return', {})
74
+ self.complete_and_wait()
75
+ self.assert_no_active_block_jobs()
76
+
37
+
77
+ result = self.vm.qmp('blockdev-del', node_name='drive0')
38
qmp_ss = qmp_ss.apply(config_host, strict: false)
78
+ self.assert_qmp(result, 'return', {})
39
libqmp = static_library('qmp', qmp_ss.sources() + genh,
79
+
40
dependencies: qmp_ss.dependencies(),
80
+ self.assertTrue(iotests.compare_images(test_img, target_img),
41
@@ -XXX,XX +XXX,XX @@ foreach m : block_mods + softmmu_mods
81
+ 'target image does not match source after mirroring')
42
install_dir: config_host['qemu_moddir'])
82
+
43
endforeach
83
+ def test_blockdev_full(self):
44
84
+ qemu_img('create', '-f', iotests.imgfmt, target_img,
45
-softmmu_ss.add(authz, block, chardev, crypto, io, qmp)
85
+ str(self.image_len))
46
+softmmu_ss.add(authz, blockdev, chardev, crypto, io, qmp)
86
+ qemu_io('-c', 'write -P 42 0 64k', target_img)
47
common_ss.add(qom, qemuutil)
87
+
48
88
+ result = self.vm.qmp('blockdev-add', node_name='target',
49
common_ss.add_all(when: 'CONFIG_SOFTMMU', if_true: [softmmu_ss])
89
+ driver=iotests.imgfmt,
50
diff --git a/storage-daemon/meson.build b/storage-daemon/meson.build
90
+ file={
91
+ 'driver': 'file',
92
+ 'filename': target_img,
93
+ })
94
+ self.assert_qmp(result, 'return', {})
95
+
96
+ self.assert_no_active_block_jobs()
97
+ result = self.vm.qmp('blockdev-mirror', job_id='drive0', device='drive0',
98
+ sync='full', target='target')
99
+ self.assert_qmp(result, 'return', {})
100
+ self.complete_and_wait()
101
+ self.assert_no_active_block_jobs()
102
+
103
+ result = self.vm.qmp('blockdev-del', node_name='drive0')
104
+ self.assert_qmp(result, 'return', {})
105
+
106
+ result = self.vm.qmp('blockdev-del', node_name='target')
107
+ self.assert_qmp(result, 'return', {})
108
+
109
+ self.assertTrue(iotests.compare_images(test_img, target_img),
110
+ 'target image does not match source after mirroring')
111
+
112
class TestGranularity(iotests.QMPTestCase):
113
image_len = 10 * 1024 * 1024 # MB
114
115
diff --git a/tests/qemu-iotests/041.out b/tests/qemu-iotests/041.out
116
index XXXXXXX..XXXXXXX 100644
51
index XXXXXXX..XXXXXXX 100644
117
--- a/tests/qemu-iotests/041.out
52
--- a/storage-daemon/meson.build
118
+++ b/tests/qemu-iotests/041.out
53
+++ b/storage-daemon/meson.build
119
@@ -XXX,XX +XXX,XX @@
54
@@ -XXX,XX +XXX,XX @@
120
-........................................................................................
55
qsd_ss = ss.source_set()
121
+..........................................................................................
56
qsd_ss.add(files('qemu-storage-daemon.c'))
122
----------------------------------------------------------------------
57
-qsd_ss.add(block, chardev, qmp, qom, qemuutil)
123
-Ran 88 tests
58
-qsd_ss.add_all(blockdev_ss)
124
+Ran 90 tests
59
+qsd_ss.add(blockdev, chardev, qmp, qom, qemuutil)
125
60
126
OK
61
subdir('qapi')
62
127
--
63
--
128
2.21.0
64
2.26.2
129
65
130
diff view generated by jsdifflib
1
Suggested-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
1
Block exports are used by softmmu, qemu-storage-daemon, and qemu-nbd.
2
Fixes: 69f47505ee66afaa513305de0c1895a224e52c45
2
They are not used by other programs and are not otherwise needed in
3
Signed-off-by: Max Reitz <mreitz@redhat.com>
3
libblock.
4
Message-id: 20190725155512.9827-2-mreitz@redhat.com
4
5
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
5
Undo the recent move of blockdev-nbd.c from blockdev_ss into block_ss.
6
Reviewed-by: John Snow <jsnow@redhat.com>
6
Since bdrv_close_all() (libblock) calls blk_exp_close_all()
7
Signed-off-by: Max Reitz <mreitz@redhat.com>
7
(libblockdev), a stub function is required.
8
9
Make qemu-nbd.c use signal handling utility functions instead of
10
duplicating the code. This helps because os-posix.c is in libblockdev
11
and it depends on a qemu_system_killed() symbol that qemu-nbd.c lacks.
12
Once we use the signal handling utility functions we also end up
13
providing the necessary symbol.
14
15
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
16
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
17
Reviewed-by: Eric Blake <eblake@redhat.com>
18
Message-id: 20200929125516.186715-4-stefanha@redhat.com
19
[Fixed s/ndb/nbd/ typo in commit description as suggested by Eric Blake
20
--Stefan]
21
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
8
---
22
---
9
block/vdi.c | 3 ++-
23
qemu-nbd.c | 21 ++++++++-------------
10
1 file changed, 2 insertions(+), 1 deletion(-)
24
stubs/blk-exp-close-all.c | 7 +++++++
25
block/export/meson.build | 4 ++--
26
meson.build | 4 ++--
27
nbd/meson.build | 2 ++
28
stubs/meson.build | 1 +
29
6 files changed, 22 insertions(+), 17 deletions(-)
30
create mode 100644 stubs/blk-exp-close-all.c
11
31
12
diff --git a/block/vdi.c b/block/vdi.c
32
diff --git a/qemu-nbd.c b/qemu-nbd.c
13
index XXXXXXX..XXXXXXX 100644
33
index XXXXXXX..XXXXXXX 100644
14
--- a/block/vdi.c
34
--- a/qemu-nbd.c
15
+++ b/block/vdi.c
35
+++ b/qemu-nbd.c
16
@@ -XXX,XX +XXX,XX @@ static int coroutine_fn vdi_co_block_status(BlockDriverState *bs,
36
@@ -XXX,XX +XXX,XX @@
17
*map = s->header.offset_data + (uint64_t)bmap_entry * s->block_size +
37
#include "qapi/error.h"
18
index_in_block;
38
#include "qemu/cutils.h"
19
*file = bs->file->bs;
39
#include "sysemu/block-backend.h"
20
- return BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID;
40
+#include "sysemu/runstate.h" /* for qemu_system_killed() prototype */
21
+ return BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID |
41
#include "block/block_int.h"
22
+ (s->header.image_type == VDI_TYPE_STATIC ? BDRV_BLOCK_RECURSE : 0);
42
#include "block/nbd.h"
43
#include "qemu/main-loop.h"
44
@@ -XXX,XX +XXX,XX @@ QEMU_COPYRIGHT "\n"
23
}
45
}
24
46
25
static int coroutine_fn
47
#ifdef CONFIG_POSIX
48
-static void termsig_handler(int signum)
49
+/*
50
+ * The client thread uses SIGTERM to interrupt the server. A signal
51
+ * handler ensures that "qemu-nbd -v -c" exits with a nice status code.
52
+ */
53
+void qemu_system_killed(int signum, pid_t pid)
54
{
55
qatomic_cmpxchg(&state, RUNNING, TERMINATE);
56
qemu_notify_event();
57
@@ -XXX,XX +XXX,XX @@ int main(int argc, char **argv)
58
BlockExportOptions *export_opts;
59
60
#ifdef CONFIG_POSIX
61
- /*
62
- * Exit gracefully on various signals, which includes SIGTERM used
63
- * by 'qemu-nbd -v -c'.
64
- */
65
- struct sigaction sa_sigterm;
66
- memset(&sa_sigterm, 0, sizeof(sa_sigterm));
67
- sa_sigterm.sa_handler = termsig_handler;
68
- sigaction(SIGTERM, &sa_sigterm, NULL);
69
- sigaction(SIGINT, &sa_sigterm, NULL);
70
- sigaction(SIGHUP, &sa_sigterm, NULL);
71
-
72
- signal(SIGPIPE, SIG_IGN);
73
+ os_setup_early_signal_handling();
74
+ os_setup_signal_handling();
75
#endif
76
77
socket_init();
78
diff --git a/stubs/blk-exp-close-all.c b/stubs/blk-exp-close-all.c
79
new file mode 100644
80
index XXXXXXX..XXXXXXX
81
--- /dev/null
82
+++ b/stubs/blk-exp-close-all.c
83
@@ -XXX,XX +XXX,XX @@
84
+#include "qemu/osdep.h"
85
+#include "block/export.h"
86
+
87
+/* Only used in programs that support block exports (libblockdev.fa) */
88
+void blk_exp_close_all(void)
89
+{
90
+}
91
diff --git a/block/export/meson.build b/block/export/meson.build
92
index XXXXXXX..XXXXXXX 100644
93
--- a/block/export/meson.build
94
+++ b/block/export/meson.build
95
@@ -XXX,XX +XXX,XX @@
96
-block_ss.add(files('export.c'))
97
-block_ss.add(when: ['CONFIG_LINUX', 'CONFIG_VHOST_USER'], if_true: files('vhost-user-blk-server.c'))
98
+blockdev_ss.add(files('export.c'))
99
+blockdev_ss.add(when: ['CONFIG_LINUX', 'CONFIG_VHOST_USER'], if_true: files('vhost-user-blk-server.c'))
100
diff --git a/meson.build b/meson.build
101
index XXXXXXX..XXXXXXX 100644
102
--- a/meson.build
103
+++ b/meson.build
104
@@ -XXX,XX +XXX,XX @@ subdir('dump')
105
106
block_ss.add(files(
107
'block.c',
108
- 'blockdev-nbd.c',
109
'blockjob.c',
110
'job.c',
111
'qemu-io-cmds.c',
112
@@ -XXX,XX +XXX,XX @@ subdir('block')
113
114
blockdev_ss.add(files(
115
'blockdev.c',
116
+ 'blockdev-nbd.c',
117
'iothread.c',
118
'job-qmp.c',
119
))
120
@@ -XXX,XX +XXX,XX @@ if have_tools
121
qemu_io = executable('qemu-io', files('qemu-io.c'),
122
dependencies: [block, qemuutil], install: true)
123
qemu_nbd = executable('qemu-nbd', files('qemu-nbd.c'),
124
- dependencies: [block, qemuutil], install: true)
125
+ dependencies: [blockdev, qemuutil], install: true)
126
127
subdir('storage-daemon')
128
subdir('contrib/rdmacm-mux')
129
diff --git a/nbd/meson.build b/nbd/meson.build
130
index XXXXXXX..XXXXXXX 100644
131
--- a/nbd/meson.build
132
+++ b/nbd/meson.build
133
@@ -XXX,XX +XXX,XX @@
134
block_ss.add(files(
135
'client.c',
136
'common.c',
137
+))
138
+blockdev_ss.add(files(
139
'server.c',
140
))
141
diff --git a/stubs/meson.build b/stubs/meson.build
142
index XXXXXXX..XXXXXXX 100644
143
--- a/stubs/meson.build
144
+++ b/stubs/meson.build
145
@@ -XXX,XX +XXX,XX @@
146
stub_ss.add(files('arch_type.c'))
147
stub_ss.add(files('bdrv-next-monitor-owned.c'))
148
stub_ss.add(files('blk-commit-all.c'))
149
+stub_ss.add(files('blk-exp-close-all.c'))
150
stub_ss.add(files('blockdev-close-all-bdrv-states.c'))
151
stub_ss.add(files('change-state-handler.c'))
152
stub_ss.add(files('cmos.c'))
26
--
153
--
27
2.21.0
154
2.26.2
28
155
29
diff view generated by jsdifflib
1
bdrv_has_zero_init() only has meaning for newly created images or image
1
Make it possible to specify the iothread where the export will run. By
2
areas. If qemu-img convert did not create the image itself, it cannot
2
default the block node can be moved to other AioContexts later and the
3
rely on bdrv_has_zero_init()'s result to carry any meaning.
3
export will follow. The fixed-iothread option forces strict behavior
4
that prevents changing AioContext while the export is active. See the
5
QAPI docs for details.
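
The placement rule described above can be sketched in isolation. This is
only an illustration under stated assumptions, not the code in
block/export/export.c: struct export_request, struct export_placement,
pick_export_context() and the integer context IDs are hypothetical
stand-ins, and the move_ok flag stands in for the outcome of actually
trying to move the block node to the requested iothread.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the export options. */
struct export_request {
    bool has_iothread;    /* was an iothread named at all? */
    int  iothread_ctx;    /* context of the named iothread */
    bool fixed_iothread;  /* forbid context changes while exported? */
};

/* Result of the sketch: chosen context and whether it may change later. */
struct export_placement {
    int  ctx;
    bool allow_ctx_change;
};

/*
 * Returns 0 on success, -1 if export creation has to fail.
 * cur_ctx: context the block node currently runs in.
 * move_ok: assumed result of trying to move the node to the iothread.
 */
static int pick_export_context(const struct export_request *req,
                               int cur_ctx, bool move_ok,
                               struct export_placement *out)
{
    int ctx = cur_ctx;

    if (req->has_iothread) {
        if (move_ok) {
            ctx = req->iothread_ctx;
        } else if (req->fixed_iothread) {
            return -1;  /* strict mode: the request cannot be honoured */
        }
        /* Non-strict mode: stay in the current context. */
    }

    out->ctx = ctx;
    out->allow_ctx_change = !req->fixed_iothread;
    return 0;
}
```

In this sketch a failed move is only fatal when fixed-iothread is set;
otherwise the export is created in the node's current context and
remains free to follow the node later.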
4
6
5
Signed-off-by: Max Reitz <mreitz@redhat.com>
7
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
6
Message-id: 20190724171239.8764-2-mreitz@redhat.com
8
Message-id: 20200929125516.186715-5-stefanha@redhat.com
7
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
9
[Fix stray '#' character in block-export.json and add missing "(since:
8
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
10
5.2)" as suggested by Eric Blake.
9
Signed-off-by: Max Reitz <mreitz@redhat.com>
11
--Stefan]
12
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10
---
13
---
11
qemu-img.c | 11 ++++++++---
14
qapi/block-export.json | 11 ++++++++++
12
1 file changed, 8 insertions(+), 3 deletions(-)
15
block/export/export.c | 31 +++++++++++++++++++++++++++-
16
block/export/vhost-user-blk-server.c | 5 ++++-
17
nbd/server.c | 2 --
18
4 files changed, 45 insertions(+), 4 deletions(-)
13
19
14
diff --git a/qemu-img.c b/qemu-img.c
20
diff --git a/qapi/block-export.json b/qapi/block-export.json
15
index XXXXXXX..XXXXXXX 100644
21
index XXXXXXX..XXXXXXX 100644
16
--- a/qemu-img.c
22
--- a/qapi/block-export.json
17
+++ b/qemu-img.c
23
+++ b/qapi/block-export.json
18
@@ -XXX,XX +XXX,XX @@ typedef struct ImgConvertState {
24
@@ -XXX,XX +XXX,XX @@
19
bool has_zero_init;
25
# export before completion is signalled. (since: 5.2;
20
bool compressed;
26
# default: false)
21
bool unallocated_blocks_are_zero;
27
#
22
+ bool target_is_new;
28
+# @iothread: The name of the iothread object where the export will run. The
23
bool target_has_backing;
29
+# default is to use the thread currently associated with the
24
int64_t target_backing_sectors; /* negative if unknown */
30
+# block node. (since: 5.2)
25
bool wr_in_order;
31
+#
26
@@ -XXX,XX +XXX,XX @@ static int convert_do_copy(ImgConvertState *s)
32
+# @fixed-iothread: True prevents the block node from being moved to another
27
int64_t sector_num = 0;
33
+# thread while the export is active. If true and @iothread is
28
34
+# given, export creation fails if the block node cannot be
29
/* Check whether we have zero initialisation or can get it efficiently */
35
+# moved to the iothread. The default is false. (since: 5.2)
30
- s->has_zero_init = s->min_sparse && !s->target_has_backing
36
+#
31
- ? bdrv_has_zero_init(blk_bs(s->target))
37
# Since: 4.2
32
- : false;
38
##
33
+ if (s->target_is_new && s->min_sparse && !s->target_has_backing) {
39
{ 'union': 'BlockExportOptions',
34
+ s->has_zero_init = bdrv_has_zero_init(blk_bs(s->target));
40
'base': { 'type': 'BlockExportType',
35
+ } else {
41
'id': 'str',
36
+ s->has_zero_init = false;
42
+     '*fixed-iothread': 'bool',
43
+     '*iothread': 'str',
44
'node-name': 'str',
45
'*writable': 'bool',
46
'*writethrough': 'bool' },
47
diff --git a/block/export/export.c b/block/export/export.c
48
index XXXXXXX..XXXXXXX 100644
49
--- a/block/export/export.c
50
+++ b/block/export/export.c
51
@@ -XXX,XX +XXX,XX @@
52
53
#include "block/block.h"
54
#include "sysemu/block-backend.h"
55
+#include "sysemu/iothread.h"
56
#include "block/export.h"
57
#include "block/nbd.h"
58
#include "qapi/error.h"
59
@@ -XXX,XX +XXX,XX @@ static const BlockExportDriver *blk_exp_find_driver(BlockExportType type)
60
61
BlockExport *blk_exp_add(BlockExportOptions *export, Error **errp)
62
{
63
+ bool fixed_iothread = export->has_fixed_iothread && export->fixed_iothread;
64
const BlockExportDriver *drv;
65
BlockExport *exp = NULL;
66
BlockDriverState *bs;
67
- BlockBackend *blk;
68
+ BlockBackend *blk = NULL;
69
AioContext *ctx;
70
uint64_t perm;
71
int ret;
72
@@ -XXX,XX +XXX,XX @@ BlockExport *blk_exp_add(BlockExportOptions *export, Error **errp)
73
ctx = bdrv_get_aio_context(bs);
74
aio_context_acquire(ctx);
75
76
+ if (export->has_iothread) {
77
+ IOThread *iothread;
78
+ AioContext *new_ctx;
79
+
80
+ iothread = iothread_by_id(export->iothread);
81
+ if (!iothread) {
82
+ error_setg(errp, "iothread \"%s\" not found", export->iothread);
83
+ goto fail;
84
+ }
85
+
86
+ new_ctx = iothread_get_aio_context(iothread);
87
+
88
+ ret = bdrv_try_set_aio_context(bs, new_ctx, errp);
89
+ if (ret == 0) {
90
+ aio_context_release(ctx);
91
+ aio_context_acquire(new_ctx);
92
+ ctx = new_ctx;
93
+ } else if (fixed_iothread) {
94
+ goto fail;
95
+ }
37
+ }
96
+ }
38
97
+
39
if (!s->has_zero_init && !s->target_has_backing &&
98
/*
40
bdrv_can_write_zeroes_with_unmap(blk_bs(s->target)))
99
* Block exports are used for non-shared storage migration. Make sure
41
@@ -XXX,XX +XXX,XX @@ static int img_convert(int argc, char **argv)
100
* that BDRV_O_INACTIVE is cleared and the image is ready for write
42
}
101
@@ -XXX,XX +XXX,XX @@ BlockExport *blk_exp_add(BlockExportOptions *export, Error **errp)
43
}
102
}
44
103
45
+ s.target_is_new = !skip_create;
104
blk = blk_new(ctx, perm, BLK_PERM_ALL);
46
+
105
+
47
flags = s.min_sparse ? (BDRV_O_RDWR | BDRV_O_UNMAP) : BDRV_O_RDWR;
106
+ if (!fixed_iothread) {
48
ret = bdrv_parse_cache_mode(cache, &flags, &writethrough);
107
+ blk_set_allow_aio_context_change(blk, true);
108
+ }
109
+
110
ret = blk_insert_bs(blk, bs, errp);
49
if (ret < 0) {
111
if (ret < 0) {
112
goto fail;
113
diff --git a/block/export/vhost-user-blk-server.c b/block/export/vhost-user-blk-server.c
114
index XXXXXXX..XXXXXXX 100644
115
--- a/block/export/vhost-user-blk-server.c
116
+++ b/block/export/vhost-user-blk-server.c
117
@@ -XXX,XX +XXX,XX @@ static const VuDevIface vu_blk_iface = {
118
static void blk_aio_attached(AioContext *ctx, void *opaque)
119
{
120
VuBlkExport *vexp = opaque;
121
+
122
+ vexp->export.ctx = ctx;
123
vhost_user_server_attach_aio_context(&vexp->vu_server, ctx);
124
}
125
126
static void blk_aio_detach(void *opaque)
127
{
128
VuBlkExport *vexp = opaque;
129
+
130
vhost_user_server_detach_aio_context(&vexp->vu_server);
131
+ vexp->export.ctx = NULL;
132
}
133
134
static void
135
@@ -XXX,XX +XXX,XX @@ static int vu_blk_exp_create(BlockExport *exp, BlockExportOptions *opts,
136
vu_blk_initialize_config(blk_bs(exp->blk), &vexp->blkcfg,
137
logical_block_size);
138
139
- blk_set_allow_aio_context_change(exp->blk, true);
140
blk_add_aio_context_notifier(exp->blk, blk_aio_attached, blk_aio_detach,
141
vexp);
142
143
diff --git a/nbd/server.c b/nbd/server.c
144
index XXXXXXX..XXXXXXX 100644
145
--- a/nbd/server.c
146
+++ b/nbd/server.c
147
@@ -XXX,XX +XXX,XX @@ static int nbd_export_create(BlockExport *blk_exp, BlockExportOptions *exp_args,
148
return ret;
149
}
150
151
- blk_set_allow_aio_context_change(blk, true);
152
-
153
QTAILQ_INIT(&exp->clients);
154
exp->name = g_strdup(arg->name);
155
exp->description = g_strdup(arg->description);
50
--
156
--
51
2.21.0
157
2.26.2
52
158
53
diff view generated by jsdifflib
1
Signed-off-by: Max Reitz <mreitz@redhat.com>
1
Allow the number of queues to be configured using --export
2
Message-id: 20190724171239.8764-11-mreitz@redhat.com
2
vhost-user-blk,num-queues=N. This setting should match the QEMU --device
3
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
3
vhost-user-blk-pci,num-queues=N setting but QEMU vhost-user-blk.c lowers
4
Signed-off-by: Max Reitz <mreitz@redhat.com>
4
its own value if the vhost-user-blk backend offers fewer queues than
5
QEMU.
6
7
The vhost-user-blk-server.c code is already capable of multi-queue. All
8
virtqueue processing runs in the same AioContext. No new locking is
9
needed.
10
11
Add the num-queues=N option and set the VIRTIO_BLK_F_MQ feature bit.
12
Note that the feature bit only announces the presence of the num_queues
13
configuration space field. It does not promise that there is more than 1
14
virtqueue, so we can set it unconditionally.
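
The defaulting and validation of num-queues described above can be
sketched on its own. This is a minimal illustration, assuming a
simplified options struct: struct vu_blk_opts, NUM_QUEUES_DEFAULT and
resolve_num_queues() are hypothetical names, not the identifiers used in
vhost-user-blk-server.c.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

enum { NUM_QUEUES_DEFAULT = 1 };

/* Hypothetical mirror of the optional num-queues option. */
struct vu_blk_opts {
    bool     has_num_queues;  /* was num-queues=N given? */
    uint16_t num_queues;      /* value of N, if given */
};

/*
 * Resolve the number of request virtqueues.
 * Returns 0 and stores the result in *out, or -EINVAL if the
 * caller asked for zero queues.
 */
static int resolve_num_queues(const struct vu_blk_opts *opts, uint16_t *out)
{
    uint16_t n = NUM_QUEUES_DEFAULT;

    if (opts->has_num_queues) {
        n = opts->num_queues;
    }
    if (n == 0) {
        return -EINVAL;  /* "num-queues must be greater than 0" */
    }

    *out = n;
    return 0;
}
```

The resolved value is what would be placed in the num_queues
configuration space field; the feature bit itself does not depend on it.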
15
16
I tested multi-queue by running a random read fio test with numjobs=4 on
17
an -smp 4 guest. After the benchmark finished the guest /proc/interrupts
18
file showed activity on all 4 virtio-blk MSI-X vectors. The /sys/block/vda/mq/
19
directory shows that Linux blk-mq has 4 queues configured.
20
21
An automated test is included in the next commit.
22
23
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
24
Acked-by: Markus Armbruster <armbru@redhat.com>
25
Message-id: 20201001144604.559733-2-stefanha@redhat.com
26
[Fixed accidental tab characters as suggested by Markus Armbruster
27
--Stefan]
28
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
5
---
29
---
6
tests/qemu-iotests/122 | 17 +++++++++++++++++
30
qapi/block-export.json | 10 +++++++---
7
tests/qemu-iotests/122.out | 8 ++++++++
31
block/export/vhost-user-blk-server.c | 24 ++++++++++++++++++------
8
2 files changed, 25 insertions(+)
32
2 files changed, 25 insertions(+), 9 deletions(-)
9
33
10
diff --git a/tests/qemu-iotests/122 b/tests/qemu-iotests/122
34
diff --git a/qapi/block-export.json b/qapi/block-export.json
11
index XXXXXXX..XXXXXXX 100755
35
index XXXXXXX..XXXXXXX 100644
12
--- a/tests/qemu-iotests/122
36
--- a/qapi/block-export.json
13
+++ b/tests/qemu-iotests/122
37
+++ b/qapi/block-export.json
14
@@ -XXX,XX +XXX,XX @@ for min_sparse in 4k 8k; do
38
@@ -XXX,XX +XXX,XX @@
15
$QEMU_IMG map --output=json "$TEST_IMG".orig | _filter_qemu_img_map
39
# SocketAddress types are supported. Passed fds must be UNIX domain
16
done
40
# sockets.
17
41
# @logical-block-size: Logical block size in bytes. Defaults to 512 bytes.
42
+# @num-queues: Number of request virtqueues. Must be greater than 0. Defaults
43
+# to 1.
44
#
45
# Since: 5.2
46
##
47
{ 'struct': 'BlockExportOptionsVhostUserBlk',
48
- 'data': { 'addr': 'SocketAddress', '*logical-block-size': 'size' } }
49
+ 'data': { 'addr': 'SocketAddress',
50
+     '*logical-block-size': 'size',
51
+ '*num-queues': 'uint16'} }
52
53
##
54
# @NbdServerAddOptions:
55
@@ -XXX,XX +XXX,XX @@
56
{ 'union': 'BlockExportOptions',
57
'base': { 'type': 'BlockExportType',
58
'id': 'str',
59
-     '*fixed-iothread': 'bool',
60
-     '*iothread': 'str',
61
+ '*fixed-iothread': 'bool',
62
+ '*iothread': 'str',
63
'node-name': 'str',
64
'*writable': 'bool',
65
'*writethrough': 'bool' },
66
diff --git a/block/export/vhost-user-blk-server.c b/block/export/vhost-user-blk-server.c
67
index XXXXXXX..XXXXXXX 100644
68
--- a/block/export/vhost-user-blk-server.c
69
+++ b/block/export/vhost-user-blk-server.c
70
@@ -XXX,XX +XXX,XX @@
71
#include "util/block-helpers.h"
72
73
enum {
74
- VHOST_USER_BLK_MAX_QUEUES = 1,
75
+ VHOST_USER_BLK_NUM_QUEUES_DEFAULT = 1,
76
};
77
struct virtio_blk_inhdr {
78
unsigned char status;
79
@@ -XXX,XX +XXX,XX @@ static uint64_t vu_blk_get_features(VuDev *dev)
80
1ull << VIRTIO_BLK_F_DISCARD |
81
1ull << VIRTIO_BLK_F_WRITE_ZEROES |
82
1ull << VIRTIO_BLK_F_CONFIG_WCE |
83
+ 1ull << VIRTIO_BLK_F_MQ |
84
1ull << VIRTIO_F_VERSION_1 |
85
1ull << VIRTIO_RING_F_INDIRECT_DESC |
86
1ull << VIRTIO_RING_F_EVENT_IDX |
87
@@ -XXX,XX +XXX,XX @@ static void blk_aio_detach(void *opaque)
88
89
static void
90
vu_blk_initialize_config(BlockDriverState *bs,
91
- struct virtio_blk_config *config, uint32_t blk_size)
92
+ struct virtio_blk_config *config,
93
+ uint32_t blk_size,
94
+ uint16_t num_queues)
95
{
96
config->capacity = bdrv_getlength(bs) >> BDRV_SECTOR_BITS;
97
config->blk_size = blk_size;
98
@@ -XXX,XX +XXX,XX @@ vu_blk_initialize_config(BlockDriverState *bs,
99
config->seg_max = 128 - 2;
100
config->min_io_size = 1;
101
config->opt_io_size = 1;
102
- config->num_queues = VHOST_USER_BLK_MAX_QUEUES;
103
+ config->num_queues = num_queues;
104
config->max_discard_sectors = 32768;
105
config->max_discard_seg = 1;
106
config->discard_sector_alignment = config->blk_size >> 9;
107
@@ -XXX,XX +XXX,XX @@ static int vu_blk_exp_create(BlockExport *exp, BlockExportOptions *opts,
108
BlockExportOptionsVhostUserBlk *vu_opts = &opts->u.vhost_user_blk;
109
Error *local_err = NULL;
110
uint64_t logical_block_size;
111
+ uint16_t num_queues = VHOST_USER_BLK_NUM_QUEUES_DEFAULT;
112
113
vexp->writable = opts->writable;
114
vexp->blkcfg.wce = 0;
115
@@ -XXX,XX +XXX,XX @@ static int vu_blk_exp_create(BlockExport *exp, BlockExportOptions *opts,
116
}
117
vexp->blk_size = logical_block_size;
118
blk_set_guest_block_size(exp->blk, logical_block_size);
18
+
119
+
19
+echo
120
+ if (vu_opts->has_num_queues) {
20
+echo '=== -n to a non-zero image ==='
121
+ num_queues = vu_opts->num_queues;
21
+echo
122
+ }
123
+ if (num_queues == 0) {
124
+ error_setg(errp, "num-queues must be greater than 0");
125
+ return -EINVAL;
126
+ }
22
+
127
+
23
+# Keep source zero
128
vu_blk_initialize_config(blk_bs(exp->blk), &vexp->blkcfg,
24
+_make_test_img 64M
129
- logical_block_size);
25
+
130
+ logical_block_size, num_queues);
26
+# Output is not zero, but has bdrv_has_zero_init() == 1
131
27
+TEST_IMG="$TEST_IMG".orig _make_test_img 64M
132
blk_add_aio_context_notifier(exp->blk, blk_aio_attached, blk_aio_detach,
28
+$QEMU_IO -c "write -P 42 0 64k" "$TEST_IMG".orig | _filter_qemu_io
133
vexp);
29
+
134
30
+# Convert with -n, which should not assume that the target is zeroed
135
if (!vhost_user_server_start(&vexp->vu_server, vu_opts->addr, exp->ctx,
31
+$QEMU_IMG convert -O $IMGFMT -n "$TEST_IMG" "$TEST_IMG".orig
136
- VHOST_USER_BLK_MAX_QUEUES, &vu_blk_iface,
32
+
137
- errp)) {
33
+$QEMU_IMG compare "$TEST_IMG" "$TEST_IMG".orig
138
+ num_queues, &vu_blk_iface, errp)) {
34
+
139
blk_remove_aio_context_notifier(exp->blk, blk_aio_attached,
35
# success, all done
140
blk_aio_detach, vexp);
36
echo '*** done'
141
return -EADDRNOTAVAIL;
37
rm -f $seq.full
38
diff --git a/tests/qemu-iotests/122.out b/tests/qemu-iotests/122.out
39
index XXXXXXX..XXXXXXX 100644
40
--- a/tests/qemu-iotests/122.out
41
+++ b/tests/qemu-iotests/122.out
42
@@ -XXX,XX +XXX,XX @@ convert -c -S 8k
43
{ "start": 9216, "length": 8192, "depth": 0, "zero": true, "data": false},
44
{ "start": 17408, "length": 1024, "depth": 0, "zero": false, "data": true},
45
{ "start": 18432, "length": 67090432, "depth": 0, "zero": true, "data": false}]
46
+
47
+=== -n to a non-zero image ===
48
+
49
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864
50
+Formatting 'TEST_DIR/t.IMGFMT.orig', fmt=IMGFMT size=67108864
51
+wrote 65536/65536 bytes at offset 0
52
+64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
53
+Images are identical.
54
*** done
55
--
142
--
56
2.21.0
143
2.26.2
57
144
58
diff view generated by jsdifflib
1
We need to implement .bdrv_has_zero_init_truncate() for every block
1
From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2
driver that supports truncation and has a .bdrv_has_zero_init()
3
implementation.
4
2
5
Implement it the same way each driver implements .bdrv_has_zero_init().
3
bdrv_co_block_status_above has several design problems with handling
6
This is at least not any more unsafe than what we had before.
4
short backing files:
7
5
8
Signed-off-by: Max Reitz <mreitz@redhat.com>
6
1. With want_zero=true, it may return ret with BDRV_BLOCK_ZERO but
9
Message-id: 20190724171239.8764-5-mreitz@redhat.com
7
without the BDRV_BLOCK_ALLOCATED flag, when actually the short backing file
10
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
8
which produces these after-EOF zeros is inside the requested backing
11
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
9
sequence.
12
Signed-off-by: Max Reitz <mreitz@redhat.com>
10
11
2. With want_zero=false, it may return pnum=0 prior to the actual EOF,
12
because of the EOF of a short backing file.
13
14
Fix these things, making the logic about short backing files clearer.
15
16
With bdrv_block_status_above fixed, we also have to improve is_zero in
17
qcow2 code, otherwise iotest 154 will fail, because with this patch we
18
stop merging zeros of different types (produced by fully unallocated
19
regions in the whole backing chain vs produced by short backing files).
20
21
Note also, that this patch leaves for another day the general problem
22
around block-status: misuse of BDRV_BLOCK_ALLOCATED as is-fs-allocated
23
vs go-to-backing.
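
The handling of short backing files described above can be illustrated
with a toy model. This is a sketch under stated assumptions, not the
block/io.c code: each layer is modelled as a string of per-byte
allocation marks, and struct layer, layer_status(), status_above() and
the ST_* flag values are hypothetical names chosen for the example. The
control flow follows the description: query the top layer first, then
descend, and treat zeros past the end of a short backing file as
allocated at that layer.

```c
#include <stddef.h>
#include <stdint.h>

/* Flag names echo the patch; the values are arbitrary for this sketch. */
enum {
    ST_ZERO      = 1 << 0,
    ST_ALLOCATED = 1 << 1,
    ST_EOF       = 1 << 2,
};

/* Toy layer: alloc[i] is '1' if byte i is allocated here, '0' otherwise. */
struct layer {
    const char *alloc;
    int64_t len;
};

/* Status of one layer: length of the run starting at offset, plus flags. */
static int layer_status(const struct layer *l, int64_t offset, int64_t bytes,
                        int64_t *pnum)
{
    int ret = 0;
    int64_t n = 0;

    if (offset >= l->len) {
        *pnum = 0;          /* nothing here: we are past this layer's EOF */
        return ST_EOF;
    }
    while (n < bytes && offset + n < l->len &&
           l->alloc[offset + n] == l->alloc[offset]) {
        n++;
    }
    *pnum = n;
    if (l->alloc[offset] == '1') {
        ret |= ST_ALLOCATED;
    }
    if (offset + n == l->len) {
        ret |= ST_EOF;
    }
    return ret;
}

/* layers[0] is the top; layers[1..n-1] are successive backing files. */
static int status_above(const struct layer *layers, size_t n,
                        int64_t offset, int64_t bytes, int64_t *pnum)
{
    int64_t eof = 0;
    int ret = layer_status(&layers[0], offset, bytes, pnum);

    if (*pnum == 0 || (ret & ST_ALLOCATED)) {
        return ret;
    }
    if (ret & ST_EOF) {
        eof = offset + *pnum;   /* remember where the top layer ends */
    }
    bytes = *pnum;

    for (size_t i = 1; i < n; i++) {
        ret = layer_status(&layers[i], offset, bytes, pnum);
        if (*pnum == 0) {
            /* Short backing file: zeros past its EOF count as allocated. */
            *pnum = bytes;
            ret = ST_ZERO | ST_ALLOCATED;
            break;
        }
        if (ret & ST_ALLOCATED) {
            ret &= ~ST_EOF;     /* this layer's EOF is not the top's EOF */
            break;
        }
        bytes = *pnum;          /* unallocated here: keep descending */
    }

    if (offset + *pnum == eof) {
        ret |= ST_EOF;
    }
    return ret;
}
```

With an 8-byte unallocated top layer over a 4-byte backing file, a query
for the second half of the top layer reports zeros that are marked as
allocated, rather than bare zeros or a zero-length result.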
24
25
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
26
Reviewed-by: Alberto Garcia <berto@igalia.com>
27
Reviewed-by: Eric Blake <eblake@redhat.com>
28
Message-id: 20200924194003.22080-2-vsementsov@virtuozzo.com
29
[Fix s/comes/come/ as suggested by Eric Blake
30
--Stefan]
31
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
13
---
32
---
14
block/file-posix.c | 1 +
33
block/io.c | 68 ++++++++++++++++++++++++++++++++++++++++-----------
15
block/file-win32.c | 1 +
34
block/qcow2.c | 16 ++++++++++--
16
block/gluster.c | 4 ++++
35
2 files changed, 68 insertions(+), 16 deletions(-)
17
block/nfs.c | 1 +
18
block/qcow2.c | 1 +
19
block/qed.c | 1 +
20
block/raw-format.c | 6 ++++++
21
block/rbd.c | 1 +
22
block/sheepdog.c | 1 +
23
block/ssh.c | 1 +
24
10 files changed, 18 insertions(+)
25
36
26
diff --git a/block/file-posix.c b/block/file-posix.c
37
diff --git a/block/io.c b/block/io.c
27
index XXXXXXX..XXXXXXX 100644
38
index XXXXXXX..XXXXXXX 100644
28
--- a/block/file-posix.c
39
--- a/block/io.c
29
+++ b/block/file-posix.c
40
+++ b/block/io.c
30
@@ -XXX,XX +XXX,XX @@ BlockDriver bdrv_file = {
41
@@ -XXX,XX +XXX,XX @@ bdrv_co_common_block_status_above(BlockDriverState *bs,
31
.bdrv_co_create = raw_co_create,
42
int64_t *map,
32
.bdrv_co_create_opts = raw_co_create_opts,
43
BlockDriverState **file)
33
.bdrv_has_zero_init = bdrv_has_zero_init_1,
44
{
34
+ .bdrv_has_zero_init_truncate = bdrv_has_zero_init_1,
45
+ int ret;
35
.bdrv_co_block_status = raw_co_block_status,
46
BlockDriverState *p;
36
.bdrv_co_invalidate_cache = raw_co_invalidate_cache,
47
- int ret = 0;
37
.bdrv_co_pwrite_zeroes = raw_co_pwrite_zeroes,
48
- bool first = true;
38
diff --git a/block/file-win32.c b/block/file-win32.c
49
+ int64_t eof = 0;
39
index XXXXXXX..XXXXXXX 100644
50
40
--- a/block/file-win32.c
51
assert(bs != base);
41
+++ b/block/file-win32.c
52
- for (p = bs; p != base; p = bdrv_filter_or_cow_bs(p)) {
42
@@ -XXX,XX +XXX,XX @@ BlockDriver bdrv_file = {
53
+
43
.bdrv_close = raw_close,
54
+ ret = bdrv_co_block_status(bs, want_zero, offset, bytes, pnum, map, file);
44
.bdrv_co_create_opts = raw_co_create_opts,
55
+ if (ret < 0 || *pnum == 0 || ret & BDRV_BLOCK_ALLOCATED) {
45
.bdrv_has_zero_init = bdrv_has_zero_init_1,
56
+ return ret;
46
+ .bdrv_has_zero_init_truncate = bdrv_has_zero_init_1,
57
+ }
47
58
+
48
.bdrv_aio_preadv = raw_aio_preadv,
59
+ if (ret & BDRV_BLOCK_EOF) {
49
.bdrv_aio_pwritev = raw_aio_pwritev,
60
+ eof = offset + *pnum;
50
diff --git a/block/gluster.c b/block/gluster.c
61
+ }
51
index XXXXXXX..XXXXXXX 100644
62
+
52
--- a/block/gluster.c
63
+ assert(*pnum <= bytes);
53
+++ b/block/gluster.c
64
+ bytes = *pnum;
54
@@ -XXX,XX +XXX,XX @@ static BlockDriver bdrv_gluster = {
65
+
55
.bdrv_co_writev = qemu_gluster_co_writev,
66
+ for (p = bdrv_filter_or_cow_bs(bs); p != base;
56
.bdrv_co_flush_to_disk = qemu_gluster_co_flush_to_disk,
67
+ p = bdrv_filter_or_cow_bs(p))
57
.bdrv_has_zero_init = qemu_gluster_has_zero_init,
68
+ {
58
+ .bdrv_has_zero_init_truncate = qemu_gluster_has_zero_init,
69
ret = bdrv_co_block_status(p, want_zero, offset, bytes, pnum, map,
59
#ifdef CONFIG_GLUSTERFS_DISCARD
70
file);
60
.bdrv_co_pdiscard = qemu_gluster_co_pdiscard,
71
if (ret < 0) {
61
#endif
72
- break;
62
@@ -XXX,XX +XXX,XX @@ static BlockDriver bdrv_gluster_tcp = {
73
+ return ret;
63
.bdrv_co_writev = qemu_gluster_co_writev,
74
}
64
.bdrv_co_flush_to_disk = qemu_gluster_co_flush_to_disk,
75
- if (ret & BDRV_BLOCK_ZERO && ret & BDRV_BLOCK_EOF && !first) {
65
.bdrv_has_zero_init = qemu_gluster_has_zero_init,
76
+ if (*pnum == 0) {
66
+ .bdrv_has_zero_init_truncate = qemu_gluster_has_zero_init,
77
/*
67
#ifdef CONFIG_GLUSTERFS_DISCARD
78
- * Reading beyond the end of the file continues to read
68
.bdrv_co_pdiscard = qemu_gluster_co_pdiscard,
79
- * zeroes, but we can only widen the result to the
69
#endif
80
- * unallocated length we learned from an earlier
70
@@ -XXX,XX +XXX,XX @@ static BlockDriver bdrv_gluster_unix = {
81
- * iteration.
71
.bdrv_co_writev = qemu_gluster_co_writev,
82
+ * The top layer deferred to this layer, and because this layer is
72
.bdrv_co_flush_to_disk = qemu_gluster_co_flush_to_disk,
83
+ * short, any zeroes that we synthesize beyond EOF behave as if they
73
.bdrv_has_zero_init = qemu_gluster_has_zero_init,
84
+ * were allocated at this layer.
74
+ .bdrv_has_zero_init_truncate = qemu_gluster_has_zero_init,
85
+ *
75
#ifdef CONFIG_GLUSTERFS_DISCARD
86
+ * We don't include BDRV_BLOCK_EOF into ret, as upper layer may be
76
.bdrv_co_pdiscard = qemu_gluster_co_pdiscard,
87
+ * larger. We'll add BDRV_BLOCK_EOF if needed at function end, see
77
#endif
88
+ * below.
78
@@ -XXX,XX +XXX,XX @@ static BlockDriver bdrv_gluster_rdma = {
89
*/
79
.bdrv_co_writev = qemu_gluster_co_writev,
90
+ assert(ret & BDRV_BLOCK_EOF);
80
.bdrv_co_flush_to_disk = qemu_gluster_co_flush_to_disk,
91
*pnum = bytes;
81
.bdrv_has_zero_init = qemu_gluster_has_zero_init,
92
+ if (file) {
82
+ .bdrv_has_zero_init_truncate = qemu_gluster_has_zero_init,
93
+ *file = p;
83
#ifdef CONFIG_GLUSTERFS_DISCARD
94
+ }
84
.bdrv_co_pdiscard = qemu_gluster_co_pdiscard,
95
+ ret = BDRV_BLOCK_ZERO | BDRV_BLOCK_ALLOCATED;
85
#endif
96
+ break;
86
diff --git a/block/nfs.c b/block/nfs.c
97
}
87
index XXXXXXX..XXXXXXX 100644
98
- if (ret & (BDRV_BLOCK_ZERO | BDRV_BLOCK_DATA)) {
88
--- a/block/nfs.c
99
+ if (ret & BDRV_BLOCK_ALLOCATED) {
89
+++ b/block/nfs.c
100
+ /*
90
@@ -XXX,XX +XXX,XX @@ static BlockDriver bdrv_nfs = {
101
+ * We've found the node and the status, we must break.
91
.create_opts = &nfs_create_opts,
102
+ *
92
103
+ * Drop BDRV_BLOCK_EOF, as it's not for upper layer, which may be
93
.bdrv_has_zero_init = nfs_has_zero_init,
104
+ * larger. We'll add BDRV_BLOCK_EOF if needed at function end, see
94
+ .bdrv_has_zero_init_truncate = nfs_has_zero_init,
105
+ * below.
95
.bdrv_get_allocated_file_size = nfs_get_allocated_file_size,
106
+ */
96
.bdrv_co_truncate = nfs_file_co_truncate,
107
+ ret &= ~BDRV_BLOCK_EOF;
108
break;
109
}
110
- /* [offset, pnum] unallocated on this layer, which could be only
111
- * the first part of [offset, bytes]. */
112
- bytes = MIN(bytes, *pnum);
113
- first = false;
114
+
115
+ /*
116
+ * OK, [offset, offset + *pnum) region is unallocated on this layer,
117
+ * let's continue the diving.
118
+ */
119
+ assert(*pnum <= bytes);
120
+ bytes = *pnum;
121
+ }
122
+
123
+ if (offset + *pnum == eof) {
124
+ ret |= BDRV_BLOCK_EOF;
125
}
126
+
127
return ret;
128
}
97
129
98
diff --git a/block/qcow2.c b/block/qcow2.c
130
diff --git a/block/qcow2.c b/block/qcow2.c
99
index XXXXXXX..XXXXXXX 100644
131
index XXXXXXX..XXXXXXX 100644
100
--- a/block/qcow2.c
132
--- a/block/qcow2.c
101
+++ b/block/qcow2.c
133
+++ b/block/qcow2.c
102
@@ -XXX,XX +XXX,XX @@ BlockDriver bdrv_qcow2 = {
134
@@ -XXX,XX +XXX,XX @@ static bool is_zero(BlockDriverState *bs, int64_t offset, int64_t bytes)
103
.bdrv_co_create_opts = qcow2_co_create_opts,
135
if (!bytes) {
104
.bdrv_co_create = qcow2_co_create,
136
return true;
105
.bdrv_has_zero_init = bdrv_has_zero_init_1,
137
}
106
+ .bdrv_has_zero_init_truncate = bdrv_has_zero_init_1,
138
- res = bdrv_block_status_above(bs, NULL, offset, bytes, &nr, NULL, NULL);
107
.bdrv_co_block_status = qcow2_co_block_status,
139
- return res >= 0 && (res & BDRV_BLOCK_ZERO) && nr == bytes;
108
140
+
109
.bdrv_co_preadv = qcow2_co_preadv,
141
+ /*
110
diff --git a/block/qed.c b/block/qed.c
142
+ * bdrv_block_status_above doesn't merge different types of zeros, for
111
index XXXXXXX..XXXXXXX 100644
143
+ * example, zeros which come from the region which is unallocated in
112
--- a/block/qed.c
144
+ * the whole backing chain, and zeros which come because of a short
113
+++ b/block/qed.c
145
+ * backing file. So, we need a loop.
114
@@ -XXX,XX +XXX,XX @@ static BlockDriver bdrv_qed = {
146
+ */
115
.bdrv_co_create = bdrv_qed_co_create,
147
+ do {
116
.bdrv_co_create_opts = bdrv_qed_co_create_opts,
148
+ res = bdrv_block_status_above(bs, NULL, offset, bytes, &nr, NULL, NULL);
117
.bdrv_has_zero_init = bdrv_has_zero_init_1,
149
+ offset += nr;
118
+ .bdrv_has_zero_init_truncate = bdrv_has_zero_init_1,
150
+ bytes -= nr;
119
.bdrv_co_block_status = bdrv_qed_co_block_status,
151
+ } while (res >= 0 && (res & BDRV_BLOCK_ZERO) && nr && bytes);
120
.bdrv_co_readv = bdrv_qed_co_readv,
152
+
121
.bdrv_co_writev = bdrv_qed_co_writev,
153
+ return res >= 0 && (res & BDRV_BLOCK_ZERO) && bytes == 0;
122
diff --git a/block/raw-format.c b/block/raw-format.c
123
index XXXXXXX..XXXXXXX 100644
124
--- a/block/raw-format.c
125
+++ b/block/raw-format.c
126
@@ -XXX,XX +XXX,XX @@ static int raw_has_zero_init(BlockDriverState *bs)
127
return bdrv_has_zero_init(bs->file->bs);
128
}
154
}
129
155
130
+static int raw_has_zero_init_truncate(BlockDriverState *bs)
156
static coroutine_fn int qcow2_co_pwrite_zeroes(BlockDriverState *bs,
131
+{
132
+ return bdrv_has_zero_init_truncate(bs->file->bs);
133
+}
134
+
135
static int coroutine_fn raw_co_create_opts(const char *filename, QemuOpts *opts,
136
Error **errp)
137
{
138
@@ -XXX,XX +XXX,XX @@ BlockDriver bdrv_raw = {
139
.bdrv_co_ioctl = &raw_co_ioctl,
140
.create_opts = &raw_create_opts,
141
.bdrv_has_zero_init = &raw_has_zero_init,
142
+ .bdrv_has_zero_init_truncate = &raw_has_zero_init_truncate,
143
.strong_runtime_opts = raw_strong_runtime_opts,
144
.mutable_opts = mutable_opts,
145
};
146
diff --git a/block/rbd.c b/block/rbd.c
147
index XXXXXXX..XXXXXXX 100644
148
--- a/block/rbd.c
149
+++ b/block/rbd.c
150
@@ -XXX,XX +XXX,XX @@ static BlockDriver bdrv_rbd = {
151
.bdrv_co_create = qemu_rbd_co_create,
152
.bdrv_co_create_opts = qemu_rbd_co_create_opts,
153
.bdrv_has_zero_init = bdrv_has_zero_init_1,
154
+ .bdrv_has_zero_init_truncate = bdrv_has_zero_init_1,
155
.bdrv_get_info = qemu_rbd_getinfo,
156
.create_opts = &qemu_rbd_create_opts,
157
.bdrv_getlength = qemu_rbd_getlength,
158
diff --git a/block/sheepdog.c b/block/sheepdog.c
159
index XXXXXXX..XXXXXXX 100644
160
--- a/block/sheepdog.c
161
+++ b/block/sheepdog.c
162
@@ -XXX,XX +XXX,XX @@ static BlockDriver bdrv_sheepdog = {
163
.bdrv_co_create = sd_co_create,
164
.bdrv_co_create_opts = sd_co_create_opts,
165
.bdrv_has_zero_init = bdrv_has_zero_init_1,
166
+ .bdrv_has_zero_init_truncate = bdrv_has_zero_init_1,
167
.bdrv_getlength = sd_getlength,
168
.bdrv_get_allocated_file_size = sd_get_allocated_file_size,
169
.bdrv_co_truncate = sd_co_truncate,
170
diff --git a/block/ssh.c b/block/ssh.c
171
index XXXXXXX..XXXXXXX 100644
172
--- a/block/ssh.c
173
+++ b/block/ssh.c
174
@@ -XXX,XX +XXX,XX @@ static BlockDriver bdrv_ssh = {
175
.bdrv_co_create_opts = ssh_co_create_opts,
176
.bdrv_close = ssh_close,
177
.bdrv_has_zero_init = ssh_has_zero_init,
178
+ .bdrv_has_zero_init_truncate = ssh_has_zero_init,
179
.bdrv_co_readv = ssh_co_readv,
180
.bdrv_co_writev = ssh_co_writev,
181
.bdrv_getlength = ssh_getlength,
182
--
157
--
183
2.21.0
158
2.26.2
184
159
185
diff view generated by jsdifflib
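The reason is_zero() in block/qcow2.c now needs a loop can be shown outside QEMU. What follows is a minimal sketch, not QEMU code: toy_block_status(), ToyExtent, toy_extents and TOY_ZERO are hypothetical stand-ins for bdrv_block_status_above() and BDRV_BLOCK_ZERO. The extent table assumes two adjacent zero regions of different origin that a single status call does not merge.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_ZERO 0x1 /* hypothetical stand-in for BDRV_BLOCK_ZERO */

typedef struct ToyExtent {
    int64_t end; /* exclusive end offset of this extent */
    int flags;   /* TOY_ZERO or 0 */
} ToyExtent;

/* Assumed layout: two zero extents of different origin, then data. */
static const ToyExtent toy_extents[] = {
    { 100, TOY_ZERO }, /* [0, 100): unallocated in the whole chain */
    { 200, TOY_ZERO }, /* [100, 200): beyond EOF of a short backing file */
    { 300, 0 },        /* [200, 300): data */
};

/* Reports the extent containing 'offset'; never merges two extents. */
static int toy_block_status(int64_t offset, int64_t bytes, int64_t *pnum)
{
    size_t n = sizeof(toy_extents) / sizeof(toy_extents[0]);

    for (size_t i = 0; i < n; i++) {
        if (offset < toy_extents[i].end) {
            int64_t avail = toy_extents[i].end - offset;
            *pnum = bytes < avail ? bytes : avail;
            return toy_extents[i].flags;
        }
    }
    *pnum = 0; /* past the end of the toy image */
    return 0;
}

/* Same loop shape as the patched is_zero(): keep asking until done. */
static bool toy_is_zero(int64_t offset, int64_t bytes)
{
    int64_t nr;
    int res;

    assert(offset >= 0 && bytes >= 0);
    if (!bytes) {
        return true;
    }
    do {
        res = toy_block_status(offset, bytes, &nr);
        offset += nr;
        bytes -= nr;
    } while (res >= 0 && (res & TOY_ZERO) && nr && bytes);

    return res >= 0 && (res & TOY_ZERO) && bytes == 0;
}
```

In this sketch a single status call for [0, 200) reports only 100 bytes, so a one-shot "nr == bytes" check would reject a range that is in fact all zeroes. The loop accepts it.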
1
69f47505ee has changed qcow2 in such a way that the commit job run in
1
From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2
test 141 (and 144[1]) returns before it emits the READY event. However,
3
141 also runs with qed, where the order is still the other way around.
4
Just filter out the {"return": {}} so the test passes for qed again.
5
2
6
[1] 144 only runs with qcow2, so it is fine as it is.
3
In order to reuse bdrv_common_block_status_above in
4
bdrv_is_allocated_above, let's support an include_base parameter.
7
5
8
Suggested-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
6
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
9
Fixes: 69f47505ee66afaa513305de0c1895a224e52c45
7
Reviewed-by: Alberto Garcia <berto@igalia.com>
10
Signed-off-by: Max Reitz <mreitz@redhat.com>
8
Reviewed-by: Eric Blake <eblake@redhat.com>
11
Message-id: 20190809185253.17535-1-mreitz@redhat.com
9
Message-id: 20200924194003.22080-3-vsementsov@virtuozzo.com
12
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
10
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
13
Reviewed-by: John Snow <jsnow@redhat.com>
14
Signed-off-by: Max Reitz <mreitz@redhat.com>
15
---
11
---
16
tests/qemu-iotests/141 | 9 +++++++--
12
block/coroutines.h | 2 ++
17
tests/qemu-iotests/141.out | 5 -----
13
block/io.c | 21 ++++++++++++++-------
18
tests/qemu-iotests/common.filter | 5 +++++
14
2 files changed, 16 insertions(+), 7 deletions(-)
19
3 files changed, 12 insertions(+), 7 deletions(-)
20
15
21
diff --git a/tests/qemu-iotests/141 b/tests/qemu-iotests/141
16
diff --git a/block/coroutines.h b/block/coroutines.h
22
index XXXXXXX..XXXXXXX 100755
23
--- a/tests/qemu-iotests/141
24
+++ b/tests/qemu-iotests/141
25
@@ -XXX,XX +XXX,XX @@ test_blockjob()
26
}}}" \
27
'return'
28
29
+ # If "$2" is an event, we may or may not see it before the
30
+ # {"return": {}}. Therefore, filter the {"return": {}} out both
31
+ # here and in the next command. (Naturally, if we do not see it
32
+ # here, we will see it before the next command can be executed,
33
+ # so it will appear in the next _send_qemu_cmd's output.)
34
_send_qemu_cmd $QEMU_HANDLE \
35
"$1" \
36
"$2" \
37
- | _filter_img_create
38
+ | _filter_img_create | _filter_qmp_empty_return
39
40
# We want this to return an error because the block job is still running
41
_send_qemu_cmd $QEMU_HANDLE \
42
"{'execute': 'blockdev-del',
43
'arguments': {'node-name': 'drv0'}}" \
44
- 'error' | _filter_generated_node_ids
45
+ 'error' | _filter_generated_node_ids | _filter_qmp_empty_return
46
47
_send_qemu_cmd $QEMU_HANDLE \
48
"{'execute': 'block-job-cancel',
49
diff --git a/tests/qemu-iotests/141.out b/tests/qemu-iotests/141.out
50
index XXXXXXX..XXXXXXX 100644
17
index XXXXXXX..XXXXXXX 100644
51
--- a/tests/qemu-iotests/141.out
18
--- a/block/coroutines.h
52
+++ b/tests/qemu-iotests/141.out
19
+++ b/block/coroutines.h
53
@@ -XXX,XX +XXX,XX @@ Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576 backing_file=TEST_DIR/m.
20
@@ -XXX,XX +XXX,XX @@ bdrv_pwritev(BdrvChild *child, int64_t offset, unsigned int bytes,
54
Formatting 'TEST_DIR/o.IMGFMT', fmt=IMGFMT size=1048576 backing_file=TEST_DIR/t.IMGFMT backing_fmt=IMGFMT
21
int coroutine_fn
55
{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "job0"}}
22
bdrv_co_common_block_status_above(BlockDriverState *bs,
56
{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "job0"}}
23
BlockDriverState *base,
57
-{"return": {}}
24
+ bool include_base,
58
{"error": {"class": "GenericError", "desc": "Node drv0 is in use"}}
25
bool want_zero,
59
{"return": {}}
26
int64_t offset,
60
{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "aborting", "id": "job0"}}
27
int64_t bytes,
61
@@ -XXX,XX +XXX,XX @@ Formatting 'TEST_DIR/o.IMGFMT', fmt=IMGFMT size=1048576 backing_file=TEST_DIR/t.
28
@@ -XXX,XX +XXX,XX @@ bdrv_co_common_block_status_above(BlockDriverState *bs,
62
{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "job0"}}
29
int generated_co_wrapper
63
{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "ready", "id": "job0"}}
30
bdrv_common_block_status_above(BlockDriverState *bs,
64
{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "BLOCK_JOB_READY", "data": {"device": "job0", "len": 0, "offset": 0, "speed": 0, "type": "mirror"}}
31
BlockDriverState *base,
65
-{"return": {}}
32
+ bool include_base,
66
{"error": {"class": "GenericError", "desc": "Node 'drv0' is busy: block device is in use by block job: mirror"}}
33
bool want_zero,
67
{"return": {}}
34
int64_t offset,
68
{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "waiting", "id": "job0"}}
35
int64_t bytes,
69
@@ -XXX,XX +XXX,XX @@ Formatting 'TEST_DIR/o.IMGFMT', fmt=IMGFMT size=1048576 backing_file=TEST_DIR/t.
36
diff --git a/block/io.c b/block/io.c
70
{"return": {}}
71
{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "job0"}}
72
{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "job0"}}
73
-{"return": {}}
74
{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "ready", "id": "job0"}}
75
{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "BLOCK_JOB_READY", "data": {"device": "job0", "len": 0, "offset": 0, "speed": 0, "type": "commit"}}
76
{"error": {"class": "GenericError", "desc": "Node 'drv0' is busy: block device is in use by block job: commit"}}
77
@@ -XXX,XX +XXX,XX @@ wrote 1048576/1048576 bytes at offset 0
78
{"return": {}}
79
{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "job0"}}
80
{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "job0"}}
81
-{"return": {}}
82
{"error": {"class": "GenericError", "desc": "Node drv0 is in use"}}
83
{"return": {}}
84
{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "aborting", "id": "job0"}}
85
@@ -XXX,XX +XXX,XX @@ wrote 1048576/1048576 bytes at offset 0
86
{"return": {}}
87
{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "job0"}}
88
{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "job0"}}
89
-{"return": {}}
90
{"error": {"class": "GenericError", "desc": "Node drv0 is in use"}}
91
{"return": {}}
92
{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "aborting", "id": "job0"}}
93
diff --git a/tests/qemu-iotests/common.filter b/tests/qemu-iotests/common.filter
94
index XXXXXXX..XXXXXXX 100644
37
index XXXXXXX..XXXXXXX 100644
95
--- a/tests/qemu-iotests/common.filter
38
--- a/block/io.c
96
+++ b/tests/qemu-iotests/common.filter
39
+++ b/block/io.c
97
@@ -XXX,XX +XXX,XX @@ _filter_nbd()
40
@@ -XXX,XX +XXX,XX @@ early_out:
98
-e 's#\(foo\|PORT/\?\|.sock\): Failed to .*$#\1#'
41
int coroutine_fn
42
bdrv_co_common_block_status_above(BlockDriverState *bs,
43
BlockDriverState *base,
44
+ bool include_base,
45
bool want_zero,
46
int64_t offset,
47
int64_t bytes,
48
@@ -XXX,XX +XXX,XX @@ bdrv_co_common_block_status_above(BlockDriverState *bs,
49
BlockDriverState *p;
50
int64_t eof = 0;
51
52
- assert(bs != base);
53
+ assert(include_base || bs != base);
54
+ assert(!include_base || base); /* Can't include NULL base */
55
56
ret = bdrv_co_block_status(bs, want_zero, offset, bytes, pnum, map, file);
57
- if (ret < 0 || *pnum == 0 || ret & BDRV_BLOCK_ALLOCATED) {
58
+ if (ret < 0 || *pnum == 0 || ret & BDRV_BLOCK_ALLOCATED || bs == base) {
59
return ret;
60
}
61
62
@@ -XXX,XX +XXX,XX @@ bdrv_co_common_block_status_above(BlockDriverState *bs,
63
assert(*pnum <= bytes);
64
bytes = *pnum;
65
66
- for (p = bdrv_filter_or_cow_bs(bs); p != base;
67
+ for (p = bdrv_filter_or_cow_bs(bs); include_base || p != base;
68
p = bdrv_filter_or_cow_bs(p))
69
{
70
ret = bdrv_co_block_status(p, want_zero, offset, bytes, pnum, map,
71
@@ -XXX,XX +XXX,XX @@ bdrv_co_common_block_status_above(BlockDriverState *bs,
72
break;
73
}
74
75
+ if (p == base) {
76
+ assert(include_base);
77
+ break;
78
+ }
79
+
80
/*
81
* OK, [offset, offset + *pnum) region is unallocated on this layer,
82
* let's continue the diving.
83
@@ -XXX,XX +XXX,XX @@ int bdrv_block_status_above(BlockDriverState *bs, BlockDriverState *base,
84
int64_t offset, int64_t bytes, int64_t *pnum,
85
int64_t *map, BlockDriverState **file)
86
{
87
- return bdrv_common_block_status_above(bs, base, true, offset, bytes,
88
+ return bdrv_common_block_status_above(bs, base, false, true, offset, bytes,
89
pnum, map, file);
99
}
90
}
100
91
101
+_filter_qmp_empty_return()
92
@@ -XXX,XX +XXX,XX @@ int coroutine_fn bdrv_is_allocated(BlockDriverState *bs, int64_t offset,
102
+{
93
int ret;
103
+ grep -v '{"return": {}}'
94
int64_t dummy;
104
+}
95
105
+
96
- ret = bdrv_common_block_status_above(bs, bdrv_filter_or_cow_bs(bs), false,
106
# make sure this script returns success
97
- offset, bytes, pnum ? pnum : &dummy,
107
true
98
- NULL, NULL);
99
+ ret = bdrv_common_block_status_above(bs, bs, true, false, offset,
100
+ bytes, pnum ? pnum : &dummy, NULL,
101
+ NULL);
102
if (ret < 0) {
103
return ret;
104
}
108
--
105
--
109
2.21.0
106
2.26.2
110
107
111
diff view generated by jsdifflib
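The effect of include_base on the chain walk can be sketched with a toy model. This is an illustration under stated assumptions, not the QEMU implementation: ToyNode, toy_layer_allocated() and toy_is_allocated_above() are hypothetical names, each layer has a single allocated range, and the result is computed for one offset only (no pnum, no EOF handling).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layer: one allocated range plus a link to its backing layer. */
typedef struct ToyNode {
    int64_t alloc_start;     /* allocated range is [alloc_start, alloc_end) */
    int64_t alloc_end;
    struct ToyNode *backing; /* next layer down, NULL at the bottom */
} ToyNode;

static bool toy_layer_allocated(const ToyNode *n, int64_t offset)
{
    return offset >= n->alloc_start && offset < n->alloc_end;
}

/*
 * Returns true if 'offset' is allocated in any layer from 'top' down
 * towards 'base'. 'base' itself is examined only when include_base is set.
 * A NULL base means "walk the whole chain".
 */
static bool toy_is_allocated_above(const ToyNode *top, const ToyNode *base,
                                   bool include_base, int64_t offset)
{
    const ToyNode *p;

    assert(!include_base || base); /* can't include a NULL base */

    for (p = top; p; p = p->backing) {
        if (p == base && !include_base) {
            break; /* base is excluded: stop before looking at it */
        }
        if (toy_layer_allocated(p, offset)) {
            return true;
        }
        if (p == base) {
            break; /* base was included and examined: nothing below counts */
        }
    }
    return false;
}
```

With top == base and include_base == false the loop examines no layer at all, so the sketch reports "unallocated" for every offset.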
1
No .bdrv_has_zero_init() implementation returns 1 if growing the file
1
From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2
would add non-zero areas (at least with PREALLOC_MODE_OFF), so using it
3
in lieu of this new function was always safe.
4
2
5
But on the other hand, it is possible that growing an image that is not
3
We are going to reuse bdrv_common_block_status_above in
6
zero-initialized would still add a zero-initialized area, like when
4
bdrv_is_allocated_above. bdrv_is_allocated_above may be called with
7
using nonpreallocating truncation on a preallocated image. For callers
5
include_base == false and still bs == base (for example, from img_rebase()).
8
that care only about truncation, not about creation with potential
9
preallocation, this new function is useful.
10
6
11
Alternatively, we could have added a PreallocMode parameter to
7
So, support this corner case.
12
bdrv_has_zero_init(). But the only user would have been qemu-img
13
convert, which does not have a plain PreallocMode value right now -- it
14
would have to parse the creation option to obtain it. Therefore, the
15
simpler solution is to let bdrv_has_zero_init() inquire the
16
preallocation status and add the new bdrv_has_zero_init_truncate() that
17
presupposes PREALLOC_MODE_OFF.
18
8
19
Signed-off-by: Max Reitz <mreitz@redhat.com>
9
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
20
Message-id: 20190724171239.8764-4-mreitz@redhat.com
10
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
21
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
11
Reviewed-by: Eric Blake <eblake@redhat.com>
22
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
12
Reviewed-by: Alberto Garcia <berto@igalia.com>
23
Signed-off-by: Max Reitz <mreitz@redhat.com>
13
Message-id: 20200924194003.22080-4-vsementsov@virtuozzo.com
14
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
24
---
15
---
25
include/block/block.h | 1 +
16
block/io.c | 6 +++++-
26
include/block/block_int.h | 7 +++++++
17
1 file changed, 5 insertions(+), 1 deletion(-)
27
block.c | 21 +++++++++++++++++++++
28
3 files changed, 29 insertions(+)
29
18
30
diff --git a/include/block/block.h b/include/block/block.h
19
diff --git a/block/io.c b/block/io.c
31
index XXXXXXX..XXXXXXX 100644
20
index XXXXXXX..XXXXXXX 100644
32
--- a/include/block/block.h
21
--- a/block/io.c
33
+++ b/include/block/block.h
22
+++ b/block/io.c
34
@@ -XXX,XX +XXX,XX @@ int bdrv_pdiscard(BdrvChild *child, int64_t offset, int64_t bytes);
23
@@ -XXX,XX +XXX,XX @@ bdrv_co_common_block_status_above(BlockDriverState *bs,
35
int bdrv_co_pdiscard(BdrvChild *child, int64_t offset, int64_t bytes);
24
BlockDriverState *p;
36
int bdrv_has_zero_init_1(BlockDriverState *bs);
25
int64_t eof = 0;
37
int bdrv_has_zero_init(BlockDriverState *bs);
26
38
+int bdrv_has_zero_init_truncate(BlockDriverState *bs);
27
- assert(include_base || bs != base);
39
bool bdrv_unallocated_blocks_are_zero(BlockDriverState *bs);
28
assert(!include_base || base); /* Can't include NULL base */
40
bool bdrv_can_write_zeroes_with_unmap(BlockDriverState *bs);
29
41
int bdrv_block_status(BlockDriverState *bs, int64_t offset,
30
+ if (!include_base && bs == base) {
42
diff --git a/include/block/block_int.h b/include/block/block_int.h
31
+ *pnum = bytes;
43
index XXXXXXX..XXXXXXX 100644
44
--- a/include/block/block_int.h
45
+++ b/include/block/block_int.h
46
@@ -XXX,XX +XXX,XX @@ struct BlockDriver {
47
/*
48
* Returns 1 if newly created images are guaranteed to contain only
49
* zeros, 0 otherwise.
50
+ * Must return 0 if .bdrv_has_zero_init_truncate() returns 0.
51
*/
52
int (*bdrv_has_zero_init)(BlockDriverState *bs);
53
54
+ /*
55
+ * Returns 1 if new areas added by growing the image with
56
+ * PREALLOC_MODE_OFF contain only zeros, 0 otherwise.
57
+ */
58
+ int (*bdrv_has_zero_init_truncate)(BlockDriverState *bs);
59
+
60
/* Remove fd handlers, timers, and other event loop callbacks so the event
61
* loop is no longer in use. Called with no in-flight requests and in
62
* depth-first traversal order with parents before child nodes.
63
diff --git a/block.c b/block.c
64
index XXXXXXX..XXXXXXX 100644
65
--- a/block.c
66
+++ b/block.c
67
@@ -XXX,XX +XXX,XX @@ int bdrv_has_zero_init(BlockDriverState *bs)
68
return 0;
69
}
70
71
+int bdrv_has_zero_init_truncate(BlockDriverState *bs)
72
+{
73
+ if (!bs->drv) {
74
+ return 0;
32
+ return 0;
75
+ }
33
+ }
76
+
34
+
77
+ if (bs->backing) {
35
ret = bdrv_co_block_status(bs, want_zero, offset, bytes, pnum, map, file);
78
+ /* Depends on the backing image length, but better safe than sorry */
36
if (ret < 0 || *pnum == 0 || ret & BDRV_BLOCK_ALLOCATED || bs == base) {
79
+ return 0;
37
return ret;
80
+ }
81
+ if (bs->drv->bdrv_has_zero_init_truncate) {
82
+ return bs->drv->bdrv_has_zero_init_truncate(bs);
83
+ }
84
+ if (bs->file && bs->drv->is_filter) {
85
+ return bdrv_has_zero_init_truncate(bs->file->bs);
86
+ }
87
+
88
+ /* safe default */
89
+ return 0;
90
+}
91
+
92
bool bdrv_unallocated_blocks_are_zero(BlockDriverState *bs)
93
{
94
BlockDriverInfo bdi;
95
--
38
--
96
2.21.0
39
2.26.2
97
40
98
diff view generated by jsdifflib
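The order of checks in bdrv_has_zero_init_truncate() from the block.c hunk can be restated as a standalone sketch. ToyBDS, ToyDriver and toy_always_one() are hypothetical simplifications: the real code reaches the child node through bs->file->bs, while this toy stores the child pointer directly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct ToyBDS ToyBDS;

/* Hypothetical, stripped-down driver description. */
typedef struct ToyDriver {
    bool is_filter;
    int (*has_zero_init_truncate)(ToyBDS *bs); /* may be NULL */
} ToyDriver;

/* Hypothetical, stripped-down node. */
struct ToyBDS {
    const ToyDriver *drv; /* NULL if no driver is attached */
    ToyBDS *backing;      /* non-NULL if the node has a backing file */
    ToyBDS *file;         /* child node, simplified from bs->file->bs */
};

/* Example callback for a driver whose grown areas always read as zero. */
static int toy_always_one(ToyBDS *bs)
{
    (void)bs;
    return 1;
}

static int toy_has_zero_init_truncate(ToyBDS *bs)
{
    assert(bs);
    if (!bs->drv) {
        return 0;
    }
    if (bs->backing) {
        /* Depends on the backing image length: answer conservatively. */
        return 0;
    }
    if (bs->drv->has_zero_init_truncate) {
        return bs->drv->has_zero_init_truncate(bs);
    }
    if (bs->file && bs->drv->is_filter) {
        /* Filters defer to the node they filter. */
        return toy_has_zero_init_truncate(bs->file);
    }
    return 0; /* safe default */
}
```

Every path that cannot prove zero-initialization returns 0, so a caller in this sketch only skips zeroing when a driver callback explicitly says it is safe.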
1
vhdx and parallels call bdrv_has_zero_init() when they do not really
1
From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2
care about an image's post-create state but only about what happens when
3
you grow an image. That is a bit ugly, and also overly safe when
4
growing preallocated images without preallocating the new areas.
5
2
6
Let them use bdrv_has_zero_init_truncate() instead.
3
bdrv_is_allocated_above wrongly handles short backing files: it reports
4
after-EOF space as UNALLOCATED, which is wrong because on read the data is
5
generated at the level of the short backing file (if all overlays have
6
unallocated areas at that place).
7
7
8
Signed-off-by: Max Reitz <mreitz@redhat.com>
8
Reusing bdrv_common_block_status_above fixes the issue and unifies the code
9
Message-id: 20190724171239.8764-6-mreitz@redhat.com
9
path.
10
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
10
11
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
11
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
12
[mreitz: Added commit message]
12
Reviewed-by: Eric Blake <eblake@redhat.com>
13
Signed-off-by: Max Reitz <mreitz@redhat.com>
13
Reviewed-by: Alberto Garcia <berto@igalia.com>
14
Message-id: 20200924194003.22080-5-vsementsov@virtuozzo.com
15
[Fix s/has/have/ as suggested by Eric Blake. Fix s/area/areas/.
16
--Stefan]
17
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
14
---
18
---
15
block/parallels.c | 2 +-
19
block/io.c | 43 +++++--------------------------------------
16
block/vhdx.c | 2 +-
20
1 file changed, 5 insertions(+), 38 deletions(-)
17
2 files changed, 2 insertions(+), 2 deletions(-)
18
21
19
diff --git a/block/parallels.c b/block/parallels.c
22
diff --git a/block/io.c b/block/io.c
20
index XXXXXXX..XXXXXXX 100644
23
index XXXXXXX..XXXXXXX 100644
21
--- a/block/parallels.c
24
--- a/block/io.c
22
+++ b/block/parallels.c
25
+++ b/block/io.c
23
@@ -XXX,XX +XXX,XX @@ static int parallels_open(BlockDriverState *bs, QDict *options, int flags,
26
@@ -XXX,XX +XXX,XX @@ int coroutine_fn bdrv_is_allocated(BlockDriverState *bs, int64_t offset,
24
goto fail_options;
27
* at 'offset + *pnum' may return the same allocation status (in other
28
* words, the result is not necessarily the maximum possible range);
29
* but 'pnum' will only be 0 when end of file is reached.
30
- *
31
*/
32
int bdrv_is_allocated_above(BlockDriverState *top,
33
BlockDriverState *base,
34
bool include_base, int64_t offset,
35
int64_t bytes, int64_t *pnum)
36
{
37
- BlockDriverState *intermediate;
38
- int ret;
39
- int64_t n = bytes;
40
-
41
- assert(base || !include_base);
42
-
43
- intermediate = top;
44
- while (include_base || intermediate != base) {
45
- int64_t pnum_inter;
46
- int64_t size_inter;
47
-
48
- assert(intermediate);
49
- ret = bdrv_is_allocated(intermediate, offset, bytes, &pnum_inter);
50
- if (ret < 0) {
51
- return ret;
52
- }
53
- if (ret) {
54
- *pnum = pnum_inter;
55
- return 1;
56
- }
57
-
58
- size_inter = bdrv_getlength(intermediate);
59
- if (size_inter < 0) {
60
- return size_inter;
61
- }
62
- if (n > pnum_inter &&
63
- (intermediate == top || offset + pnum_inter < size_inter)) {
64
- n = pnum_inter;
65
- }
66
-
67
- if (intermediate == base) {
68
- break;
69
- }
70
-
71
- intermediate = bdrv_filter_or_cow_bs(intermediate);
72
+ int ret = bdrv_common_block_status_above(top, base, include_base, false,
73
+ offset, bytes, pnum, NULL, NULL);
74
+ if (ret < 0) {
75
+ return ret;
25
}
76
}
26
77
27
- if (!bdrv_has_zero_init(bs->file->bs)) {
78
- *pnum = n;
28
+ if (!bdrv_has_zero_init_truncate(bs->file->bs)) {
79
- return 0;
29
s->prealloc_mode = PRL_PREALLOC_MODE_FALLOCATE;
80
+ return !!(ret & BDRV_BLOCK_ALLOCATED);
30
}
81
}
31
82
32
diff --git a/block/vhdx.c b/block/vhdx.c
83
int coroutine_fn
33
index XXXXXXX..XXXXXXX 100644
34
--- a/block/vhdx.c
35
+++ b/block/vhdx.c
36
@@ -XXX,XX +XXX,XX @@ static coroutine_fn int vhdx_co_writev(BlockDriverState *bs, int64_t sector_num,
37
/* Queue another write of zero buffers if the underlying file
38
* does not zero-fill on file extension */
39
40
- if (bdrv_has_zero_init(bs->file->bs) == 0) {
41
+ if (bdrv_has_zero_init_truncate(bs->file->bs) == 0) {
42
use_zero_buffers = true;
43
44
/* zero fill the front, if any */
45
--
84
--
46
2.21.0
85
2.26.2
47
86
48
diff view generated by jsdifflib
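The short-backing-file rule behind this fix can be sketched per offset. This is a toy model, not QEMU code: ToyLayer and toy_find_provider() are hypothetical, each layer has one allocated range, and the sizes are loosely modeled on the 2 MiB / 1 MiB / 2 MiB chain that appears in the 274.out Formatting lines, with the base assumed to be fully allocated.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layer: a length, one allocated range, and a backing link. */
typedef struct ToyLayer {
    int64_t length;           /* virtual size of this layer */
    int64_t alloc_start;      /* allocated range is [alloc_start, alloc_end) */
    int64_t alloc_end;
    struct ToyLayer *backing; /* next layer down, NULL at the bottom */
} ToyLayer;

/*
 * Returns the layer that determines what a read at 'offset' sees through
 * 'top', or NULL if the offset is unallocated in the whole chain.
 */
static const ToyLayer *toy_find_provider(const ToyLayer *top, int64_t offset)
{
    const ToyLayer *p;

    assert(offset >= 0 && offset < top->length);

    for (p = top; p; p = p->backing) {
        if (p != top && offset >= p->length) {
            /*
             * Short backing file: reads beyond its EOF yield zeroes that
             * are synthesized at this layer, so the layer counts as the
             * provider and deeper layers must not be consulted.
             */
            return p;
        }
        if (offset >= p->alloc_start && offset < p->alloc_end) {
            return p;
        }
    }
    return NULL;
}
```

In this model an offset in the second MiB resolves to the short middle layer and not to the base, which is why reporting that region as unallocated above the base would be wrong.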
1
Add a test case for converting an empty image (which only returns zeroes
1
From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2
when read) to a preallocated encrypted qcow2 image.
3
qcow2_has_zero_init() should return 0 then, thus forcing qemu-img
4
convert to create zero clusters.
5
2
6
Signed-off-by: Max Reitz <mreitz@redhat.com>
3
These cases are fixed by previous patches around block_status and
7
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
4
is_allocated.
8
Tested-by: Stefano Garzarella <sgarzare@redhat.com>
5
9
Message-id: 20190724171239.8764-10-mreitz@redhat.com
6
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
10
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
7
Reviewed-by: Eric Blake <eblake@redhat.com>
11
Signed-off-by: Max Reitz <mreitz@redhat.com>
8
Reviewed-by: Alberto Garcia <berto@igalia.com>
9
Message-id: 20200924194003.22080-6-vsementsov@virtuozzo.com
10
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
12
---
11
---
13
tests/qemu-iotests/188 | 20 +++++++++++++++++++-
12
tests/qemu-iotests/274 | 20 +++++++++++
14
tests/qemu-iotests/188.out | 4 ++++
13
tests/qemu-iotests/274.out | 68 ++++++++++++++++++++++++++++++++++++++
15
2 files changed, 23 insertions(+), 1 deletion(-)
14
2 files changed, 88 insertions(+)
16
15
17
diff --git a/tests/qemu-iotests/188 b/tests/qemu-iotests/188
16
diff --git a/tests/qemu-iotests/274 b/tests/qemu-iotests/274
18
index XXXXXXX..XXXXXXX 100755
17
index XXXXXXX..XXXXXXX 100755
19
--- a/tests/qemu-iotests/188
18
--- a/tests/qemu-iotests/274
20
+++ b/tests/qemu-iotests/188
19
+++ b/tests/qemu-iotests/274
21
@@ -XXX,XX +XXX,XX @@ SECRETALT="secret,id=sec0,data=platypus"
20
@@ -XXX,XX +XXX,XX @@ with iotests.FilePath('base') as base, \
22
21
iotests.qemu_io_log('-c', 'read -P 1 0 %d' % size_short, mid)
23
_make_test_img --object $SECRET -o "encrypt.format=luks,encrypt.key-secret=sec0,encrypt.iter-time=10" $size
22
iotests.qemu_io_log('-c', 'read -P 0 %d %d' % (size_short, size_diff), mid)
24
23
25
-IMGSPEC="driver=$IMGFMT,file.filename=$TEST_IMG,encrypt.key-secret=sec0"
24
+ iotests.log('=== Testing qemu-img commit (top -> base) ===')
26
+IMGSPEC="driver=$IMGFMT,encrypt.key-secret=sec0,file.filename=$TEST_IMG"
27
28
QEMU_IO_OPTIONS=$QEMU_IO_OPTIONS_NO_FMT
29
30
@@ -XXX,XX +XXX,XX @@ echo
31
echo "== verify open failure with wrong password =="
32
$QEMU_IO --object $SECRETALT -c "read -P 0xa 0 $size" --image-opts $IMGSPEC | _filter_qemu_io | _filter_testdir
33
34
+_cleanup_test_img
35
+
25
+
36
+echo
26
+ create_chain()
37
+echo "== verify that has_zero_init returns false when preallocating =="
27
+ iotests.qemu_img_log('commit', '-b', base, top)
28
+ iotests.img_info_log(base)
29
+ iotests.qemu_io_log('-c', 'read -P 1 0 %d' % size_short, base)
30
+ iotests.qemu_io_log('-c', 'read -P 0 %d %d' % (size_short, size_diff), base)
38
+
31
+
39
+# Empty source file
32
+ iotests.log('=== Testing QMP active commit (top -> base) ===')
40
+if [ -n "$TEST_IMG_FILE" ]; then
41
+ TEST_IMG_FILE="${TEST_IMG_FILE}.orig" _make_test_img $size
42
+else
43
+ TEST_IMG="${TEST_IMG}.orig" _make_test_img $size
44
+fi
45
+
33
+
46
+$QEMU_IMG convert -O "$IMGFMT" --object $SECRET \
34
+ create_chain()
47
+ -o "encrypt.format=luks,encrypt.key-secret=sec0,encrypt.iter-time=10,preallocation=metadata" \
35
+ with create_vm() as vm:
48
+ "${TEST_IMG}.orig" "$TEST_IMG"
36
+ vm.launch()
37
+ vm.qmp_log('block-commit', device='top', base_node='base',
38
+ job_id='job0', auto_dismiss=False)
39
+ vm.run_job('job0', wait=5)
49
+
40
+
50
+$QEMU_IMG compare --object $SECRET --image-opts "${IMGSPEC}.orig" "$IMGSPEC"
41
+ iotests.img_info_log(mid)
42
+ iotests.qemu_io_log('-c', 'read -P 1 0 %d' % size_short, base)
43
+ iotests.qemu_io_log('-c', 'read -P 0 %d %d' % (size_short, size_diff), base)
44
45
iotests.log('== Resize tests ==')
46
47
diff --git a/tests/qemu-iotests/274.out b/tests/qemu-iotests/274.out
48
index XXXXXXX..XXXXXXX 100644
49
--- a/tests/qemu-iotests/274.out
50
+++ b/tests/qemu-iotests/274.out
51
@@ -XXX,XX +XXX,XX @@ read 1048576/1048576 bytes at offset 0
52
read 1048576/1048576 bytes at offset 1048576
53
1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
54
55
+=== Testing qemu-img commit (top -> base) ===
56
+Formatting 'TEST_DIR/PID-base', fmt=qcow2 cluster_size=65536 extended_l2=off compression_type=zlib size=2097152 lazy_refcounts=off refcount_bits=16
51
+
57
+
52
58
+Formatting 'TEST_DIR/PID-mid', fmt=qcow2 cluster_size=65536 extended_l2=off compression_type=zlib size=1048576 backing_file=TEST_DIR/PID-base backing_fmt=qcow2 lazy_refcounts=off refcount_bits=16
53
# success, all done
54
echo "*** done"
55
diff --git a/tests/qemu-iotests/188.out b/tests/qemu-iotests/188.out
56
index XXXXXXX..XXXXXXX 100644
57
--- a/tests/qemu-iotests/188.out
58
+++ b/tests/qemu-iotests/188.out
59
@@ -XXX,XX +XXX,XX @@ read 16777216/16777216 bytes at offset 0
60
61
== verify open failure with wrong password ==
62
qemu-io: can't open: Invalid password, cannot unlock any keyslot
63
+
59
+
64
+== verify that has_zero_init returns false when preallocating ==
60
+Formatting 'TEST_DIR/PID-top', fmt=qcow2 cluster_size=65536 extended_l2=off compression_type=zlib size=2097152 backing_file=TEST_DIR/PID-mid backing_fmt=qcow2 lazy_refcounts=off refcount_bits=16
65
+Formatting 'TEST_DIR/t.IMGFMT.orig', fmt=IMGFMT size=16777216
61
+
66
+Images are identical.
62
+wrote 2097152/2097152 bytes at offset 0
67
*** done
63
+2 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
64
+
65
+Image committed.
66
+
67
+image: TEST_IMG
68
+file format: IMGFMT
69
+virtual size: 2 MiB (2097152 bytes)
70
+cluster_size: 65536
71
+Format specific information:
72
+ compat: 1.1
73
+ compression type: zlib
74
+ lazy refcounts: false
75
+ refcount bits: 16
76
+ corrupt: false
77
+ extended l2: false
78
+
79
+read 1048576/1048576 bytes at offset 0
80
+1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
81
+
82
+read 1048576/1048576 bytes at offset 1048576
83
+1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
84
+
85
+=== Testing QMP active commit (top -> base) ===
86
+Formatting 'TEST_DIR/PID-base', fmt=qcow2 cluster_size=65536 extended_l2=off compression_type=zlib size=2097152 lazy_refcounts=off refcount_bits=16
87
+
88
+Formatting 'TEST_DIR/PID-mid', fmt=qcow2 cluster_size=65536 extended_l2=off compression_type=zlib size=1048576 backing_file=TEST_DIR/PID-base backing_fmt=qcow2 lazy_refcounts=off refcount_bits=16
89
+
90
+Formatting 'TEST_DIR/PID-top', fmt=qcow2 cluster_size=65536 extended_l2=off compression_type=zlib size=2097152 backing_file=TEST_DIR/PID-mid backing_fmt=qcow2 lazy_refcounts=off refcount_bits=16
91
+
92
+wrote 2097152/2097152 bytes at offset 0
93
+2 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
94
+
95
+{"execute": "block-commit", "arguments": {"auto-dismiss": false, "base-node": "base", "device": "top", "job-id": "job0"}}
96
+{"return": {}}
97
+{"execute": "job-complete", "arguments": {"id": "job0"}}
98
+{"return": {}}
99
+{"data": {"device": "job0", "len": 1048576, "offset": 1048576, "speed": 0, "type": "commit"}, "event": "BLOCK_JOB_READY", "timestamp": {"microseconds": "USECS", "seconds": "SECS"}}
100
+{"data": {"device": "job0", "len": 1048576, "offset": 1048576, "speed": 0, "type": "commit"}, "event": "BLOCK_JOB_COMPLETED", "timestamp": {"microseconds": "USECS", "seconds": "SECS"}}
101
+{"execute": "job-dismiss", "arguments": {"id": "job0"}}
102
+{"return": {}}
103
+image: TEST_IMG
104
+file format: IMGFMT
105
+virtual size: 1 MiB (1048576 bytes)
106
+cluster_size: 65536
107
+backing file: TEST_DIR/PID-base
108
+backing file format: IMGFMT
109
+Format specific information:
110
+ compat: 1.1
111
+ compression type: zlib
112
+ lazy refcounts: false
113
+ refcount bits: 16
114
+ corrupt: false
115
+ extended l2: false
116
+
117
+read 1048576/1048576 bytes at offset 0
118
+1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
119
+
120
+read 1048576/1048576 bytes at offset 1048576
121
+1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
122
+
123
== Resize tests ==
124
=== preallocation=off ===
125
Formatting 'TEST_DIR/PID-base', fmt=qcow2 cluster_size=65536 extended_l2=off compression_type=zlib size=6442450944 lazy_refcounts=off refcount_bits=16
68
--
126
--
69
2.21.0
127
2.26.2
70
128
71
diff view generated by jsdifflib