1
The following changes since commit f5fe7c17ac4e309e47e78f0f9761aebc8d2f2c81:
1
The following changes since commit ac793156f650ae2d77834932d72224175ee69086:
2
2
3
Merge tag 'pull-tcg-20230823-2' of https://gitlab.com/rth7680/qemu into staging (2023-08-28 16:07:04 -0400)
3
Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20201020-1' into staging (2020-10-20 21:11:35 +0100)
4
4
5
are available in the Git repository at:
5
are available in the Git repository at:
6
6
7
https://gitlab.com/hreitz/qemu.git tags/pull-block-2023-09-01
7
https://gitlab.com/stefanha/qemu.git tags/block-pull-request
8
8
9
for you to fetch changes up to 380448464dd89291cf7fd7434be6c225482a334d:
9
for you to fetch changes up to 32a3fd65e7e3551337fd26bfc0e2f899d70c028c:
10
10
11
tests/file-io-error: New test (2023-08-29 13:01:24 +0200)
11
iotests: add commit top->base cases to 274 (2020-10-22 09:55:39 +0100)
12
12
13
----------------------------------------------------------------
13
----------------------------------------------------------------
14
Block patches
14
Pull request
15
15
16
- Fix for file-posix's zoning code crashing on I/O errors
16
v2:
17
- Throttling refactoring
17
* Fix format string issues on 32-bit hosts [Peter]
18
* Fix qemu-nbd.c CONFIG_POSIX ifdef issue [Eric]
19
* Fix missing eventfd.h header on macOS [Peter]
20
* Drop unreliable vhost-user-blk test (will send a new patch when ready) [Peter]
21
22
This pull request contains the vhost-user-blk server by Coiby Xu along with my
23
additions, block/nvme.c alignment and hardware error statistics by Philippe
24
Mathieu-Daudé, and bdrv_co_block_status_above() fixes by Vladimir
25
Sementsov-Ogievskiy.
18
26
19
----------------------------------------------------------------
27
----------------------------------------------------------------
20
Hanna Czenczek (5):
21
file-posix: Clear bs->bl.zoned on error
22
file-posix: Check bs->bl.zoned for zone info
23
file-posix: Fix zone update in I/O error path
24
file-posix: Simplify raw_co_prw's 'out' zone code
25
tests/file-io-error: New test
26
28
27
Zhenwei Pi (9):
29
Coiby Xu (6):
28
throttle: introduce enum ThrottleDirection
30
libvhost-user: Allow vu_message_read to be replaced
29
test-throttle: use enum ThrottleDirection
31
libvhost-user: remove watch for kick_fd when de-initialize vu-dev
30
throttle: support read-only and write-only
32
util/vhost-user-server: generic vhost user server
31
test-throttle: test read only and write only
33
block: move logical block size check function to a common utility
32
cryptodev: use NULL throttle timer cb for read direction
34
function
33
throttle: use enum ThrottleDirection instead of bool is_write
35
block/export: vhost-user block device backend server
34
throttle: use THROTTLE_MAX/ARRAY_SIZE for hard code
36
MAINTAINERS: Add vhost-user block device backend server maintainer
35
fsdev: Use ThrottleDirection instread of bool is_write
36
block/throttle-groups: Use ThrottleDirection instread of bool is_write
37
37
38
fsdev/qemu-fsdev-throttle.h | 4 +-
38
Philippe Mathieu-Daudé (1):
39
include/block/throttle-groups.h | 6 +-
39
block/nvme: Add driver statistics for access alignment and hw errors
40
include/qemu/throttle.h | 16 +-
40
41
backends/cryptodev.c | 12 +-
41
Stefan Hajnoczi (16):
42
block/block-backend.c | 4 +-
42
util/vhost-user-server: s/fileds/fields/ typo fix
43
block/file-posix.c | 42 +++---
43
util/vhost-user-server: drop unnecessary QOM cast
44
block/throttle-groups.c | 163 +++++++++++----------
44
util/vhost-user-server: drop unnecessary watch deletion
45
block/throttle.c | 8 +-
45
block/export: consolidate request structs into VuBlockReq
46
fsdev/qemu-fsdev-throttle.c | 18 ++-
46
util/vhost-user-server: drop unused DevicePanicNotifier
47
hw/9pfs/cofile.c | 4 +-
47
util/vhost-user-server: fix memory leak in vu_message_read()
48
tests/unit/test-throttle.c | 76 +++++++++-
48
util/vhost-user-server: check EOF when reading payload
49
util/throttle.c | 84 +++++++----
49
util/vhost-user-server: rework vu_client_trip() coroutine lifecycle
50
tests/qemu-iotests/tests/file-io-error | 119 +++++++++++++++
50
block/export: report flush errors
51
tests/qemu-iotests/tests/file-io-error.out | 33 +++++
51
block/export: convert vhost-user-blk server to block export API
52
14 files changed, 418 insertions(+), 171 deletions(-)
52
util/vhost-user-server: move header to include/
53
create mode 100755 tests/qemu-iotests/tests/file-io-error
53
util/vhost-user-server: use static library in meson.build
54
create mode 100644 tests/qemu-iotests/tests/file-io-error.out
54
qemu-storage-daemon: avoid compiling blockdev_ss twice
55
block: move block exports to libblockdev
56
block/export: add iothread and fixed-iothread options
57
block/export: add vhost-user-blk multi-queue support
58
59
Vladimir Sementsov-Ogievskiy (5):
60
block/io: fix bdrv_co_block_status_above
61
block/io: bdrv_common_block_status_above: support include_base
62
block/io: bdrv_common_block_status_above: support bs == base
63
block/io: fix bdrv_is_allocated_above
64
iotests: add commit top->base cases to 274
65
66
MAINTAINERS | 9 +
67
qapi/block-core.json | 24 +-
68
qapi/block-export.json | 36 +-
69
block/coroutines.h | 2 +
70
block/export/vhost-user-blk-server.h | 19 +
71
contrib/libvhost-user/libvhost-user.h | 21 +
72
include/qemu/vhost-user-server.h | 65 +++
73
util/block-helpers.h | 19 +
74
block/export/export.c | 37 +-
75
block/export/vhost-user-blk-server.c | 431 ++++++++++++++++++++
76
block/io.c | 132 +++---
77
block/nvme.c | 27 ++
78
block/qcow2.c | 16 +-
79
contrib/libvhost-user/libvhost-user-glib.c | 2 +-
80
contrib/libvhost-user/libvhost-user.c | 15 +-
81
hw/core/qdev-properties-system.c | 31 +-
82
nbd/server.c | 2 -
83
qemu-nbd.c | 21 +-
84
softmmu/vl.c | 4 +
85
stubs/blk-exp-close-all.c | 7 +
86
tests/vhost-user-bridge.c | 2 +
87
tools/virtiofsd/fuse_virtio.c | 4 +-
88
util/block-helpers.c | 46 +++
89
util/vhost-user-server.c | 446 +++++++++++++++++++++
90
block/export/meson.build | 3 +-
91
contrib/libvhost-user/meson.build | 1 +
92
meson.build | 22 +-
93
nbd/meson.build | 2 +
94
storage-daemon/meson.build | 3 +-
95
stubs/meson.build | 1 +
96
tests/qemu-iotests/274 | 20 +
97
tests/qemu-iotests/274.out | 68 ++++
98
util/meson.build | 4 +
99
33 files changed, 1420 insertions(+), 122 deletions(-)
100
create mode 100644 block/export/vhost-user-blk-server.h
101
create mode 100644 include/qemu/vhost-user-server.h
102
create mode 100644 util/block-helpers.h
103
create mode 100644 block/export/vhost-user-blk-server.c
104
create mode 100644 stubs/blk-exp-close-all.c
105
create mode 100644 util/block-helpers.c
106
create mode 100644 util/vhost-user-server.c
55
107
56
--
108
--
57
2.41.0
109
2.26.2
110
New patch
1
From: Philippe Mathieu-Daudé <philmd@redhat.com>
1
2
3
Keep statistics of some hardware errors, and number of
4
aligned/unaligned I/O accesses.
5
6
QMP example booting a full RHEL 8.3 aarch64 guest:
7
8
{ "execute": "query-blockstats" }
9
{
10
"return": [
11
{
12
"device": "",
13
"node-name": "drive0",
14
"stats": {
15
"flush_total_time_ns": 6026948,
16
"wr_highest_offset": 3383991230464,
17
"wr_total_time_ns": 807450995,
18
"failed_wr_operations": 0,
19
"failed_rd_operations": 0,
20
"wr_merged": 3,
21
"wr_bytes": 50133504,
22
"failed_unmap_operations": 0,
23
"failed_flush_operations": 0,
24
"account_invalid": false,
25
"rd_total_time_ns": 1846979900,
26
"flush_operations": 130,
27
"wr_operations": 659,
28
"rd_merged": 1192,
29
"rd_bytes": 218244096,
30
"account_failed": false,
31
"idle_time_ns": 2678641497,
32
"rd_operations": 7406
33
},
34
"driver-specific": {
35
"driver": "nvme",
36
"completion-errors": 0,
37
"unaligned-accesses": 2959,
38
"aligned-accesses": 4477
39
},
40
"qdev": "/machine/peripheral-anon/device[0]/virtio-backend"
41
}
42
]
43
}
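As a rough illustration of how these counters might be consumed, the Python sketch below computes the share of unaligned accesses from a query-blockstats reply. The helper name unaligned_ratio is hypothetical, and the reply is a trimmed, hard-coded copy of the example above rather than something fetched over a QMP socket.

```python
import json

# Trimmed copy of the reply shown above; only the fields used here are kept.
REPLY = json.loads("""
{
  "return": [
    {
      "node-name": "drive0",
      "driver-specific": {
        "driver": "nvme",
        "completion-errors": 0,
        "unaligned-accesses": 2959,
        "aligned-accesses": 4477
      }
    }
  ]
}
""")


def unaligned_ratio(reply, node_name):
    """Return the fraction of unaligned accesses for an NVMe node.

    Returns None when the node is missing, is not driven by nvme,
    or has not performed any access yet.
    """
    for entry in reply["return"]:
        specific = entry.get("driver-specific", {})
        if entry.get("node-name") != node_name:
            continue
        if specific.get("driver") != "nvme":
            continue
        aligned = specific["aligned-accesses"]
        unaligned = specific["unaligned-accesses"]
        total = aligned + unaligned
        return unaligned / total if total else None
    return None


ratio = unaligned_ratio(REPLY, "drive0")
print(f"unaligned share: {ratio:.1%}")
```

A persistently high share suggests that many requests take the bounce-buffer path in nvme_co_prw() rather than the aligned fast path.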
44
45
Suggested-by: Stefan Hajnoczi <stefanha@gmail.com>
46
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
47
Acked-by: Markus Armbruster <armbru@redhat.com>
48
Message-id: 20201001162939.1567915-1-philmd@redhat.com
49
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
50
---
51
qapi/block-core.json | 24 +++++++++++++++++++++++-
52
block/nvme.c | 27 +++++++++++++++++++++++++++
53
2 files changed, 50 insertions(+), 1 deletion(-)
54
55
diff --git a/qapi/block-core.json b/qapi/block-core.json
56
index XXXXXXX..XXXXXXX 100644
57
--- a/qapi/block-core.json
58
+++ b/qapi/block-core.json
59
@@ -XXX,XX +XXX,XX @@
60
'discard-nb-failed': 'uint64',
61
'discard-bytes-ok': 'uint64' } }
62
63
+##
64
+# @BlockStatsSpecificNvme:
65
+#
66
+# NVMe driver statistics
67
+#
68
+# @completion-errors: The number of completion errors.
69
+#
70
+# @aligned-accesses: The number of aligned accesses performed by
71
+# the driver.
72
+#
73
+# @unaligned-accesses: The number of unaligned accesses performed by
74
+# the driver.
75
+#
76
+# Since: 5.2
77
+##
78
+{ 'struct': 'BlockStatsSpecificNvme',
79
+ 'data': {
80
+ 'completion-errors': 'uint64',
81
+ 'aligned-accesses': 'uint64',
82
+ 'unaligned-accesses': 'uint64' } }
83
+
84
##
85
# @BlockStatsSpecific:
86
#
87
@@ -XXX,XX +XXX,XX @@
88
'discriminator': 'driver',
89
'data': {
90
'file': 'BlockStatsSpecificFile',
91
- 'host_device': 'BlockStatsSpecificFile' } }
92
+ 'host_device': 'BlockStatsSpecificFile',
93
+ 'nvme': 'BlockStatsSpecificNvme' } }
94
95
##
96
# @BlockStats:
97
diff --git a/block/nvme.c b/block/nvme.c
98
index XXXXXXX..XXXXXXX 100644
99
--- a/block/nvme.c
100
+++ b/block/nvme.c
101
@@ -XXX,XX +XXX,XX @@ struct BDRVNVMeState {
102
103
/* PCI address (required for nvme_refresh_filename()) */
104
char *device;
105
+
106
+ struct {
107
+ uint64_t completion_errors;
108
+ uint64_t aligned_accesses;
109
+ uint64_t unaligned_accesses;
110
+ } stats;
111
};
112
113
#define NVME_BLOCK_OPT_DEVICE "device"
114
@@ -XXX,XX +XXX,XX @@ static bool nvme_process_completion(NVMeQueuePair *q)
115
break;
116
}
117
ret = nvme_translate_error(c);
118
+ if (ret) {
119
+ s->stats.completion_errors++;
120
+ }
121
q->cq.head = (q->cq.head + 1) % NVME_QUEUE_SIZE;
122
if (!q->cq.head) {
123
q->cq_phase = !q->cq_phase;
124
@@ -XXX,XX +XXX,XX @@ static int nvme_co_prw(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
125
assert(QEMU_IS_ALIGNED(bytes, s->page_size));
126
assert(bytes <= s->max_transfer);
127
if (nvme_qiov_aligned(bs, qiov)) {
128
+ s->stats.aligned_accesses++;
129
return nvme_co_prw_aligned(bs, offset, bytes, qiov, is_write, flags);
130
}
131
+ s->stats.unaligned_accesses++;
132
trace_nvme_prw_buffered(s, offset, bytes, qiov->niov, is_write);
133
buf = qemu_try_memalign(s->page_size, bytes);
134
135
@@ -XXX,XX +XXX,XX @@ static void nvme_unregister_buf(BlockDriverState *bs, void *host)
136
qemu_vfio_dma_unmap(s->vfio, host);
137
}
138
139
+static BlockStatsSpecific *nvme_get_specific_stats(BlockDriverState *bs)
140
+{
141
+ BlockStatsSpecific *stats = g_new(BlockStatsSpecific, 1);
142
+ BDRVNVMeState *s = bs->opaque;
143
+
144
+ stats->driver = BLOCKDEV_DRIVER_NVME;
145
+ stats->u.nvme = (BlockStatsSpecificNvme) {
146
+ .completion_errors = s->stats.completion_errors,
147
+ .aligned_accesses = s->stats.aligned_accesses,
148
+ .unaligned_accesses = s->stats.unaligned_accesses,
149
+ };
150
+
151
+ return stats;
152
+}
153
+
154
static const char *const nvme_strong_runtime_opts[] = {
155
NVME_BLOCK_OPT_DEVICE,
156
NVME_BLOCK_OPT_NAMESPACE,
157
@@ -XXX,XX +XXX,XX @@ static BlockDriver bdrv_nvme = {
158
.bdrv_refresh_filename = nvme_refresh_filename,
159
.bdrv_refresh_limits = nvme_refresh_limits,
160
.strong_runtime_opts = nvme_strong_runtime_opts,
161
+ .bdrv_get_specific_stats = nvme_get_specific_stats,
162
163
.bdrv_detach_aio_context = nvme_detach_aio_context,
164
.bdrv_attach_aio_context = nvme_attach_aio_context,
165
--
166
2.26.2
167
1
From: zhenwei pi <pizhenwei@bytedance.com>
1
From: Coiby Xu <coiby.xu@gmail.com>
2
2
3
The 'bool is_write' style is obsolete in the throttle framework; adapt
3
Allow vu_message_read to be replaced by one which will make use of the
4
block throttle groups to the new style:
4
QIOChannel functions. Thus reading a vhost-user message won't stall the
5
- use ThrottleDirection instead of 'bool is_write'. For example,
5
guest. For the slave channel, we still use the default vu_message_read.
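The replaceable reader amounts to an optional callback with a built-in fallback. The Python sketch below models only that pattern and is not the library's code (libvhost-user is C); Dev, default_read and channel_read are hypothetical names chosen for illustration.

```python
def default_read(dev, sock, msg):
    """Stand-in for the library's built-in blocking reader."""
    msg["via"] = "default"
    return True


def channel_read(dev, sock, msg):
    """Hypothetical custom reader, e.g. one built on a non-blocking channel."""
    msg["via"] = "custom"
    return True


class Dev:
    def __init__(self, sock, read_msg=None):
        self.sock = sock
        self.slave_sock = None
        # Fall back to the built-in reader when no custom one is supplied.
        self.read_msg = read_msg if read_msg is not None else default_read

    def dispatch(self):
        # Main channel: goes through the (possibly custom) reader.
        msg = {}
        ok = self.read_msg(self, self.sock, msg)
        return msg if ok else None

    def read_slave_reply(self):
        # Slave channel: always uses the built-in reader.
        msg = {}
        ok = default_read(self, self.slave_sock, msg)
        return msg if ok else None


plain = Dev(sock=3)
custom = Dev(sock=3, read_msg=channel_read)
print(plain.dispatch(), custom.dispatch(), custom.read_slave_reply())
```

Existing callers that pass no reader keep the old behaviour, which is why the glib wrapper in this patch can simply pass NULL.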
6
schedule_next_request(ThrottleGroupMember *tgm, bool is_write)
7
-> schedule_next_request(ThrottleGroupMember *tgm, ThrottleDirection direction)
8
6
9
- use THROTTLE_MAX instead of a hard-coded size. For example, ThrottleGroupMember *tokens[2]
7
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
10
-> ThrottleGroupMember *tokens[THROTTLE_MAX]
8
Signed-off-by: Coiby Xu <coiby.xu@gmail.com>
9
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
10
Message-id: 20200918080912.321299-2-coiby.xu@gmail.com
11
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
12
---
13
contrib/libvhost-user/libvhost-user.h | 21 +++++++++++++++++++++
14
contrib/libvhost-user/libvhost-user-glib.c | 2 +-
15
contrib/libvhost-user/libvhost-user.c | 14 +++++++-------
16
tests/vhost-user-bridge.c | 2 ++
17
tools/virtiofsd/fuse_virtio.c | 4 ++--
18
5 files changed, 33 insertions(+), 10 deletions(-)
11
19
12
- use ThrottleDirection instead of hard-coded bounds when iterating. For example, for (i = 0; i < 2; i++)
20
diff --git a/contrib/libvhost-user/libvhost-user.h b/contrib/libvhost-user/libvhost-user.h
13
-> for (dir = THROTTLE_READ; dir < THROTTLE_MAX; dir++)
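The three conversions listed above can be modelled in a few lines. This is only a Python illustration of the pattern, not QEMU code: the enum values (READ = 0, WRITE = 1, MAX = 2) are an assumption, chosen so that arrays previously indexed by is_write keep their layout, and Member is a hypothetical stand-in for ThrottleGroupMember.

```python
from enum import IntEnum


class ThrottleDirection(IntEnum):
    # Assumed values: READ/WRITE line up with the old false/true (0/1)
    # indices of 'bool is_write'; MAX is the number of directions.
    THROTTLE_READ = 0
    THROTTLE_WRITE = 1
    THROTTLE_MAX = 2


class Member:
    """Hypothetical stand-in for ThrottleGroupMember."""

    def __init__(self):
        # Sized by THROTTLE_MAX instead of a hard-coded 2.
        self.pending_reqs = [0] * ThrottleDirection.THROTTLE_MAX


def has_pending_reqs(member, direction):
    # Indexed by a direction instead of a bool.
    return member.pending_reqs[direction] > 0


def directions():
    # Equivalent of: for (dir = THROTTLE_READ; dir < THROTTLE_MAX; dir++)
    first = ThrottleDirection.THROTTLE_READ
    end = ThrottleDirection.THROTTLE_MAX
    return [ThrottleDirection(d) for d in range(first, end)]


m = Member()
m.pending_reqs[ThrottleDirection.THROTTLE_WRITE] += 1
for d in directions():
    print(d.name, has_pending_reqs(m, d))
```

Because the assumed values match the old boolean indices, a call site that passed true or false maps directly to THROTTLE_WRITE or THROTTLE_READ.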
14
15
A simple Python script was used to test the new style:
16
#!/usr/bin/python3
17
import subprocess
18
import random
19
import time
20
21
commands = ['virsh blkdeviotune jammy vda --write-bytes-sec ', \
22
'virsh blkdeviotune jammy vda --write-iops-sec ', \
23
'virsh blkdeviotune jammy vda --read-bytes-sec ', \
24
'virsh blkdeviotune jammy vda --read-iops-sec ']
25
26
for loop in range(1, 1000):
    time.sleep(random.randrange(3, 5))
    # randrange(len(commands)) can pick every entry; randrange(0, 3) never picked the last one
    command = commands[random.randrange(len(commands))] + str(random.randrange(0, 1000000))
    subprocess.run(command, shell=True, check=True)
30
31
This works fine.
32
33
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
34
Message-Id: <20230728022006.1098509-10-pizhenwei@bytedance.com>
35
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
36
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
37
---
38
include/block/throttle-groups.h | 6 +-
39
block/block-backend.c | 4 +-
40
block/throttle-groups.c | 161 ++++++++++++++++----------------
41
block/throttle.c | 8 +-
42
4 files changed, 90 insertions(+), 89 deletions(-)
43
44
diff --git a/include/block/throttle-groups.h b/include/block/throttle-groups.h
45
index XXXXXXX..XXXXXXX 100644
21
index XXXXXXX..XXXXXXX 100644
46
--- a/include/block/throttle-groups.h
22
--- a/contrib/libvhost-user/libvhost-user.h
47
+++ b/include/block/throttle-groups.h
23
+++ b/contrib/libvhost-user/libvhost-user.h
48
@@ -XXX,XX +XXX,XX @@ typedef struct ThrottleGroupMember {
24
@@ -XXX,XX +XXX,XX @@
49
AioContext *aio_context;
25
*/
50
/* throttled_reqs_lock protects the CoQueues for throttled requests. */
26
#define VHOST_USER_MAX_RAM_SLOTS 32
51
CoMutex throttled_reqs_lock;
27
52
- CoQueue throttled_reqs[2];
28
+#define VHOST_USER_HDR_SIZE offsetof(VhostUserMsg, payload.u64)
53
+ CoQueue throttled_reqs[THROTTLE_MAX];
29
+
54
30
typedef enum VhostSetConfigType {
55
/* Nonzero if the I/O limits are currently being ignored; generally
31
VHOST_SET_CONFIG_TYPE_MASTER = 0,
56
* it is zero. Accessed with atomic operations.
32
VHOST_SET_CONFIG_TYPE_MIGRATION = 1,
57
@@ -XXX,XX +XXX,XX @@ typedef struct ThrottleGroupMember {
33
@@ -XXX,XX +XXX,XX @@ typedef uint64_t (*vu_get_features_cb) (VuDev *dev);
58
* throttle_state tells us if I/O limits are configured. */
34
typedef void (*vu_set_features_cb) (VuDev *dev, uint64_t features);
59
ThrottleState *throttle_state;
35
typedef int (*vu_process_msg_cb) (VuDev *dev, VhostUserMsg *vmsg,
60
ThrottleTimers throttle_timers;
36
int *do_reply);
61
- unsigned pending_reqs[2];
37
+typedef bool (*vu_read_msg_cb) (VuDev *dev, int sock, VhostUserMsg *vmsg);
62
+ unsigned pending_reqs[THROTTLE_MAX];
38
typedef void (*vu_queue_set_started_cb) (VuDev *dev, int qidx, bool started);
63
QLIST_ENTRY(ThrottleGroupMember) round_robin;
39
typedef bool (*vu_queue_is_processed_in_order_cb) (VuDev *dev, int qidx);
64
40
typedef int (*vu_get_config_cb) (VuDev *dev, uint8_t *config, uint32_t len);
65
} ThrottleGroupMember;
41
@@ -XXX,XX +XXX,XX @@ struct VuDev {
66
@@ -XXX,XX +XXX,XX @@ void throttle_group_restart_tgm(ThrottleGroupMember *tgm);
42
bool broken;
67
43
uint16_t max_queues;
68
void coroutine_fn throttle_group_co_io_limits_intercept(ThrottleGroupMember *tgm,
44
69
int64_t bytes,
45
+ /* @read_msg: custom method to read vhost-user message
70
- bool is_write);
46
+ *
71
+ ThrottleDirection direction);
47
+ * Read data from vhost_user socket fd and fill up
72
void throttle_group_attach_aio_context(ThrottleGroupMember *tgm,
48
+ * the passed VhostUserMsg *vmsg struct.
73
AioContext *new_context);
49
+ *
74
void throttle_group_detach_aio_context(ThrottleGroupMember *tgm);
50
+ * If reading fails, it should close the received set of file
75
diff --git a/block/block-backend.c b/block/block-backend.c
51
+ * descriptors as socket message's auxiliary data.
52
+ *
53
+ * For the details, please refer to vu_message_read in libvhost-user.c
54
+ * which will be used by default if no custom method is provided when
55
+ * calling vu_init
56
+ *
57
+ * Returns: true if vhost-user message successfully received,
58
+ * otherwise return false.
59
+ *
60
+ */
61
+ vu_read_msg_cb read_msg;
62
/* @set_watch: add or update the given fd to the watch set,
63
* call cb when condition is met */
64
vu_set_watch_cb set_watch;
65
@@ -XXX,XX +XXX,XX @@ bool vu_init(VuDev *dev,
66
uint16_t max_queues,
67
int socket,
68
vu_panic_cb panic,
69
+ vu_read_msg_cb read_msg,
70
vu_set_watch_cb set_watch,
71
vu_remove_watch_cb remove_watch,
72
const VuDevIface *iface);
73
diff --git a/contrib/libvhost-user/libvhost-user-glib.c b/contrib/libvhost-user/libvhost-user-glib.c
76
index XXXXXXX..XXXXXXX 100644
74
index XXXXXXX..XXXXXXX 100644
77
--- a/block/block-backend.c
75
--- a/contrib/libvhost-user/libvhost-user-glib.c
78
+++ b/block/block-backend.c
76
+++ b/contrib/libvhost-user/libvhost-user-glib.c
79
@@ -XXX,XX +XXX,XX @@ blk_co_do_preadv_part(BlockBackend *blk, int64_t offset, int64_t bytes,
77
@@ -XXX,XX +XXX,XX @@ vug_init(VugDev *dev, uint16_t max_queues, int socket,
80
/* throttling disk I/O */
78
g_assert(dev);
81
if (blk->public.throttle_group_member.throttle_state) {
79
g_assert(iface);
82
throttle_group_co_io_limits_intercept(&blk->public.throttle_group_member,
80
83
- bytes, false);
81
- if (!vu_init(&dev->parent, max_queues, socket, panic, set_watch,
84
+ bytes, THROTTLE_READ);
82
+ if (!vu_init(&dev->parent, max_queues, socket, panic, NULL, set_watch,
83
remove_watch, iface)) {
84
return false;
85
}
85
}
86
86
diff --git a/contrib/libvhost-user/libvhost-user.c b/contrib/libvhost-user/libvhost-user.c
87
ret = bdrv_co_preadv_part(blk->root, offset, bytes, qiov, qiov_offset,
87
index XXXXXXX..XXXXXXX 100644
88
@@ -XXX,XX +XXX,XX @@ blk_co_do_pwritev_part(BlockBackend *blk, int64_t offset, int64_t bytes,
88
--- a/contrib/libvhost-user/libvhost-user.c
89
/* throttling disk I/O */
89
+++ b/contrib/libvhost-user/libvhost-user.c
90
if (blk->public.throttle_group_member.throttle_state) {
90
@@ -XXX,XX +XXX,XX @@
91
throttle_group_co_io_limits_intercept(&blk->public.throttle_group_member,
91
/* The version of inflight buffer */
92
- bytes, true);
92
#define INFLIGHT_VERSION 1
93
+ bytes, THROTTLE_WRITE);
93
94
-#define VHOST_USER_HDR_SIZE offsetof(VhostUserMsg, payload.u64)
95
-
96
/* The version of the protocol we support */
97
#define VHOST_USER_VERSION 1
98
#define LIBVHOST_USER_DEBUG 0
99
@@ -XXX,XX +XXX,XX @@ have_userfault(void)
100
}
101
102
static bool
103
-vu_message_read(VuDev *dev, int conn_fd, VhostUserMsg *vmsg)
104
+vu_message_read_default(VuDev *dev, int conn_fd, VhostUserMsg *vmsg)
105
{
106
char control[CMSG_SPACE(VHOST_MEMORY_BASELINE_NREGIONS * sizeof(int))] = {};
107
struct iovec iov = {
108
@@ -XXX,XX +XXX,XX @@ vu_process_message_reply(VuDev *dev, const VhostUserMsg *vmsg)
109
goto out;
94
}
110
}
95
111
96
if (!blk->enable_write_cache) {
112
- if (!vu_message_read(dev, dev->slave_fd, &msg_reply)) {
97
diff --git a/block/throttle-groups.c b/block/throttle-groups.c
113
+ if (!vu_message_read_default(dev, dev->slave_fd, &msg_reply)) {
98
index XXXXXXX..XXXXXXX 100644
114
goto out;
99
--- a/block/throttle-groups.c
100
+++ b/block/throttle-groups.c
101
@@ -XXX,XX +XXX,XX @@
102
103
static void throttle_group_obj_init(Object *obj);
104
static void throttle_group_obj_complete(UserCreatable *obj, Error **errp);
105
-static void timer_cb(ThrottleGroupMember *tgm, bool is_write);
106
+static void timer_cb(ThrottleGroupMember *tgm, ThrottleDirection direction);
107
108
/* The ThrottleGroup structure (with its ThrottleState) is shared
109
* among different ThrottleGroupMembers and it's independent from
110
@@ -XXX,XX +XXX,XX @@ struct ThrottleGroup {
111
QemuMutex lock; /* This lock protects the following four fields */
112
ThrottleState ts;
113
QLIST_HEAD(, ThrottleGroupMember) head;
114
- ThrottleGroupMember *tokens[2];
115
- bool any_timer_armed[2];
116
+ ThrottleGroupMember *tokens[THROTTLE_MAX];
117
+ bool any_timer_armed[THROTTLE_MAX];
118
QEMUClockType clock_type;
119
120
/* This field is protected by the global QEMU mutex */
121
@@ -XXX,XX +XXX,XX @@ static ThrottleGroupMember *throttle_group_next_tgm(ThrottleGroupMember *tgm)
122
* This assumes that tg->lock is held.
123
*
124
* @tgm: the ThrottleGroupMember
125
- * @is_write: the type of operation (read/write)
126
+ * @direction: the ThrottleDirection
127
* @ret: whether the ThrottleGroupMember has pending requests.
128
*/
129
static inline bool tgm_has_pending_reqs(ThrottleGroupMember *tgm,
130
- bool is_write)
131
+ ThrottleDirection direction)
132
{
133
- return tgm->pending_reqs[is_write];
134
+ return tgm->pending_reqs[direction];
135
}
136
137
/* Return the next ThrottleGroupMember in the round-robin sequence with pending
138
@@ -XXX,XX +XXX,XX @@ static inline bool tgm_has_pending_reqs(ThrottleGroupMember *tgm,
139
* This assumes that tg->lock is held.
140
*
141
* @tgm: the current ThrottleGroupMember
142
- * @is_write: the type of operation (read/write)
143
+ * @direction: the ThrottleDirection
144
* @ret: the next ThrottleGroupMember with pending requests, or tgm if
145
* there is none.
146
*/
147
static ThrottleGroupMember *next_throttle_token(ThrottleGroupMember *tgm,
148
- bool is_write)
149
+ ThrottleDirection direction)
150
{
151
ThrottleState *ts = tgm->throttle_state;
152
ThrottleGroup *tg = container_of(ts, ThrottleGroup, ts);
153
@@ -XXX,XX +XXX,XX @@ static ThrottleGroupMember *next_throttle_token(ThrottleGroupMember *tgm,
154
* it's being drained. Skip the round-robin search and return tgm
155
* immediately if it has pending requests. Otherwise we could be
156
* forcing it to wait for other member's throttled requests. */
157
- if (tgm_has_pending_reqs(tgm, is_write) &&
158
+ if (tgm_has_pending_reqs(tgm, direction) &&
159
qatomic_read(&tgm->io_limits_disabled)) {
160
return tgm;
161
}
115
}
162
116
163
- start = token = tg->tokens[is_write];
117
@@ -XXX,XX +XXX,XX @@ vu_set_mem_table_exec_postcopy(VuDev *dev, VhostUserMsg *vmsg)
164
+ start = token = tg->tokens[direction];
118
/* Wait for QEMU to confirm that it's registered the handler for the
165
119
* faults.
166
/* get next bs round in round robin style */
120
*/
167
token = throttle_group_next_tgm(token);
121
- if (!vu_message_read(dev, dev->sock, vmsg) ||
168
- while (token != start && !tgm_has_pending_reqs(token, is_write)) {
122
+ if (!dev->read_msg(dev, dev->sock, vmsg) ||
169
+ while (token != start && !tgm_has_pending_reqs(token, direction)) {
123
vmsg->size != sizeof(vmsg->payload.u64) ||
170
token = throttle_group_next_tgm(token);
124
vmsg->payload.u64 != 0) {
125
vu_panic(dev, "failed to receive valid ack for postcopy set-mem-table");
126
@@ -XXX,XX +XXX,XX @@ vu_dispatch(VuDev *dev)
127
int reply_requested;
128
bool need_reply, success = false;
129
130
- if (!vu_message_read(dev, dev->sock, &vmsg)) {
131
+ if (!dev->read_msg(dev, dev->sock, &vmsg)) {
132
goto end;
171
}
133
}
172
134
173
@@ -XXX,XX +XXX,XX @@ static ThrottleGroupMember *next_throttle_token(ThrottleGroupMember *tgm,
135
@@ -XXX,XX +XXX,XX @@ vu_init(VuDev *dev,
174
* then decide the token is the current tgm because chances are
136
uint16_t max_queues,
175
* the current tgm got the current request queued.
137
int socket,
176
*/
138
vu_panic_cb panic,
177
- if (token == start && !tgm_has_pending_reqs(token, is_write)) {
139
+ vu_read_msg_cb read_msg,
178
+ if (token == start && !tgm_has_pending_reqs(token, direction)) {
140
vu_set_watch_cb set_watch,
179
token = tgm;
141
vu_remove_watch_cb remove_watch,
180
}
142
const VuDevIface *iface)
181
143
@@ -XXX,XX +XXX,XX @@ vu_init(VuDev *dev,
182
/* Either we return the original TGM, or one with pending requests */
144
183
- assert(token == tgm || tgm_has_pending_reqs(token, is_write));
145
dev->sock = socket;
184
+ assert(token == tgm || tgm_has_pending_reqs(token, direction));
146
dev->panic = panic;
185
147
+ dev->read_msg = read_msg ? read_msg : vu_message_read_default;
186
return token;
148
dev->set_watch = set_watch;
187
}
149
dev->remove_watch = remove_watch;
188
@@ -XXX,XX +XXX,XX @@ static ThrottleGroupMember *next_throttle_token(ThrottleGroupMember *tgm,
150
dev->iface = iface;
189
* This assumes that tg->lock is held.
151
@@ -XXX,XX +XXX,XX @@ static void _vu_queue_notify(VuDev *dev, VuVirtq *vq, bool sync)
190
*
152
191
* @tgm: the current ThrottleGroupMember
153
vu_message_write(dev, dev->slave_fd, &vmsg);
192
- * @is_write: the type of operation (read/write)
154
if (ack) {
193
+ * @direction: the ThrottleDirection
155
- vu_message_read(dev, dev->slave_fd, &vmsg);
194
* @ret: whether the I/O request needs to be throttled or not
156
+ vu_message_read_default(dev, dev->slave_fd, &vmsg);
195
*/
157
}
196
static bool throttle_group_schedule_timer(ThrottleGroupMember *tgm,
197
- bool is_write)
198
+ ThrottleDirection direction)
199
{
200
ThrottleState *ts = tgm->throttle_state;
201
ThrottleGroup *tg = container_of(ts, ThrottleGroup, ts);
202
ThrottleTimers *tt = &tgm->throttle_timers;
203
- ThrottleDirection direction = is_write ? THROTTLE_WRITE : THROTTLE_READ;
204
bool must_wait;
205
206
if (qatomic_read(&tgm->io_limits_disabled)) {
207
@@ -XXX,XX +XXX,XX @@ static bool throttle_group_schedule_timer(ThrottleGroupMember *tgm,
208
}
209
210
/* Check if any of the timers in this group is already armed */
211
- if (tg->any_timer_armed[is_write]) {
212
+ if (tg->any_timer_armed[direction]) {
213
return true;
214
}
215
216
@@ -XXX,XX +XXX,XX @@ static bool throttle_group_schedule_timer(ThrottleGroupMember *tgm,
217
218
/* If a timer just got armed, set tgm as the current token */
219
if (must_wait) {
220
- tg->tokens[is_write] = tgm;
221
- tg->any_timer_armed[is_write] = true;
222
+ tg->tokens[direction] = tgm;
223
+ tg->any_timer_armed[direction] = true;
224
}
225
226
return must_wait;
227
@@ -XXX,XX +XXX,XX @@ static bool throttle_group_schedule_timer(ThrottleGroupMember *tgm,
228
* any request was actually pending.
229
*
230
* @tgm: the current ThrottleGroupMember
231
- * @is_write: the type of operation (read/write)
232
+ * @direction: the ThrottleDirection
233
*/
234
static bool coroutine_fn throttle_group_co_restart_queue(ThrottleGroupMember *tgm,
235
- bool is_write)
236
+ ThrottleDirection direction)
237
{
238
bool ret;
239
240
qemu_co_mutex_lock(&tgm->throttled_reqs_lock);
241
- ret = qemu_co_queue_next(&tgm->throttled_reqs[is_write]);
242
+ ret = qemu_co_queue_next(&tgm->throttled_reqs[direction]);
243
qemu_co_mutex_unlock(&tgm->throttled_reqs_lock);
244
245
return ret;
246
@@ -XXX,XX +XXX,XX @@ static bool coroutine_fn throttle_group_co_restart_queue(ThrottleGroupMember *tg
247
* This assumes that tg->lock is held.
248
*
249
* @tgm: the current ThrottleGroupMember
250
- * @is_write: the type of operation (read/write)
251
+ * @direction: the ThrottleDirection
252
*/
253
-static void schedule_next_request(ThrottleGroupMember *tgm, bool is_write)
254
+static void schedule_next_request(ThrottleGroupMember *tgm,
255
+ ThrottleDirection direction)
256
{
257
ThrottleState *ts = tgm->throttle_state;
258
ThrottleGroup *tg = container_of(ts, ThrottleGroup, ts);
259
@@ -XXX,XX +XXX,XX @@ static void schedule_next_request(ThrottleGroupMember *tgm, bool is_write)
260
ThrottleGroupMember *token;
261
262
/* Check if there's any pending request to schedule next */
263
- token = next_throttle_token(tgm, is_write);
264
- if (!tgm_has_pending_reqs(token, is_write)) {
265
+ token = next_throttle_token(tgm, direction);
266
+ if (!tgm_has_pending_reqs(token, direction)) {
267
return;
158
return;
268
}
159
}
269
160
diff --git a/tests/vhost-user-bridge.c b/tests/vhost-user-bridge.c
270
/* Set a timer for the request if it needs to be throttled */
271
- must_wait = throttle_group_schedule_timer(token, is_write);
272
+ must_wait = throttle_group_schedule_timer(token, direction);
273
274
/* If it doesn't have to wait, queue it for immediate execution */
275
if (!must_wait) {
276
/* Give preference to requests from the current tgm */
277
if (qemu_in_coroutine() &&
278
- throttle_group_co_restart_queue(tgm, is_write)) {
279
+ throttle_group_co_restart_queue(tgm, direction)) {
280
token = tgm;
281
} else {
282
ThrottleTimers *tt = &token->throttle_timers;
283
int64_t now = qemu_clock_get_ns(tg->clock_type);
284
- timer_mod(tt->timers[is_write], now);
285
- tg->any_timer_armed[is_write] = true;
286
+ timer_mod(tt->timers[direction], now);
287
+ tg->any_timer_armed[direction] = true;
288
}
289
- tg->tokens[is_write] = token;
290
+ tg->tokens[direction] = token;
291
}
292
}
293
294
@@ -XXX,XX +XXX,XX @@ static void schedule_next_request(ThrottleGroupMember *tgm, bool is_write)
295
*
296
* @tgm: the current ThrottleGroupMember
297
* @bytes: the number of bytes for this I/O
298
- * @is_write: the type of operation (read/write)
299
+ * @direction: the ThrottleDirection
300
*/
301
void coroutine_fn throttle_group_co_io_limits_intercept(ThrottleGroupMember *tgm,
302
int64_t bytes,
303
- bool is_write)
304
+ ThrottleDirection direction)
305
{
306
bool must_wait;
307
ThrottleGroupMember *token;
308
ThrottleGroup *tg = container_of(tgm->throttle_state, ThrottleGroup, ts);
309
- ThrottleDirection direction = is_write ? THROTTLE_WRITE : THROTTLE_READ;
310
311
assert(bytes >= 0);
312
+ assert(direction < THROTTLE_MAX);
313
314
qemu_mutex_lock(&tg->lock);
315
316
/* First we check if this I/O has to be throttled. */
317
- token = next_throttle_token(tgm, is_write);
318
- must_wait = throttle_group_schedule_timer(token, is_write);
319
+ token = next_throttle_token(tgm, direction);
320
+ must_wait = throttle_group_schedule_timer(token, direction);
321
322
/* Wait if there's a timer set or queued requests of this type */
323
- if (must_wait || tgm->pending_reqs[is_write]) {
324
- tgm->pending_reqs[is_write]++;
325
+ if (must_wait || tgm->pending_reqs[direction]) {
326
+ tgm->pending_reqs[direction]++;
327
qemu_mutex_unlock(&tg->lock);
328
qemu_co_mutex_lock(&tgm->throttled_reqs_lock);
329
- qemu_co_queue_wait(&tgm->throttled_reqs[is_write],
330
+ qemu_co_queue_wait(&tgm->throttled_reqs[direction],
331
&tgm->throttled_reqs_lock);
332
qemu_co_mutex_unlock(&tgm->throttled_reqs_lock);
333
qemu_mutex_lock(&tg->lock);
334
- tgm->pending_reqs[is_write]--;
335
+ tgm->pending_reqs[direction]--;
336
}
337
338
/* The I/O will be executed, so do the accounting */
339
throttle_account(tgm->throttle_state, direction, bytes);
340
341
/* Schedule the next request */
342
- schedule_next_request(tgm, is_write);
343
+ schedule_next_request(tgm, direction);
344
345
qemu_mutex_unlock(&tg->lock);
346
}
347
348
typedef struct {
349
ThrottleGroupMember *tgm;
350
- bool is_write;
351
+ ThrottleDirection direction;
352
} RestartData;
353
354
static void coroutine_fn throttle_group_restart_queue_entry(void *opaque)
355
@@ -XXX,XX +XXX,XX @@ static void coroutine_fn throttle_group_restart_queue_entry(void *opaque)
356
ThrottleGroupMember *tgm = data->tgm;
357
ThrottleState *ts = tgm->throttle_state;
358
ThrottleGroup *tg = container_of(ts, ThrottleGroup, ts);
359
- bool is_write = data->is_write;
360
+ ThrottleDirection direction = data->direction;
361
bool empty_queue;
362
363
- empty_queue = !throttle_group_co_restart_queue(tgm, is_write);
364
+ empty_queue = !throttle_group_co_restart_queue(tgm, direction);
365
366
/* If the request queue was empty then we have to take care of
367
* scheduling the next one */
368
if (empty_queue) {
369
qemu_mutex_lock(&tg->lock);
370
- schedule_next_request(tgm, is_write);
371
+ schedule_next_request(tgm, direction);
372
qemu_mutex_unlock(&tg->lock);
373
}
374
375
@@ -XXX,XX +XXX,XX @@ static void coroutine_fn throttle_group_restart_queue_entry(void *opaque)
376
aio_wait_kick();
377
}
378
379
-static void throttle_group_restart_queue(ThrottleGroupMember *tgm, bool is_write)
380
+static void throttle_group_restart_queue(ThrottleGroupMember *tgm,
381
+ ThrottleDirection direction)
382
{
383
Coroutine *co;
384
RestartData *rd = g_new0(RestartData, 1);
385
386
rd->tgm = tgm;
387
- rd->is_write = is_write;
388
+ rd->direction = direction;
389
390
/* This function is called when a timer is fired or when
391
* throttle_group_restart_tgm() is called. Either way, there can
392
* be no timer pending on this tgm at this point */
393
- assert(!timer_pending(tgm->throttle_timers.timers[is_write]));
394
+ assert(!timer_pending(tgm->throttle_timers.timers[direction]));
395
396
qatomic_inc(&tgm->restart_pending);
397
398
@@ -XXX,XX +XXX,XX @@ static void throttle_group_restart_queue(ThrottleGroupMember *tgm, bool is_write
399
400
void throttle_group_restart_tgm(ThrottleGroupMember *tgm)
401
{
402
- int i;
403
+ ThrottleDirection dir;
404
405
if (tgm->throttle_state) {
406
- for (i = 0; i < 2; i++) {
407
- QEMUTimer *t = tgm->throttle_timers.timers[i];
408
+ for (dir = THROTTLE_READ; dir < THROTTLE_MAX; dir++) {
409
+ QEMUTimer *t = tgm->throttle_timers.timers[dir];
410
if (timer_pending(t)) {
411
/* If there's a pending timer on this tgm, fire it now */
412
timer_del(t);
413
- timer_cb(tgm, i);
414
+ timer_cb(tgm, dir);
415
} else {
416
/* Else run the next request from the queue manually */
417
- throttle_group_restart_queue(tgm, i);
418
+ throttle_group_restart_queue(tgm, dir);
419
}
420
}
421
}
422
@@ -XXX,XX +XXX,XX @@ void throttle_group_get_config(ThrottleGroupMember *tgm, ThrottleConfig *cfg)
423
* because it had been throttled.
424
*
425
* @tgm: the ThrottleGroupMember whose request had been throttled
426
- * @is_write: the type of operation (read/write)
427
+ * @direction: the ThrottleDirection
428
*/
429
-static void timer_cb(ThrottleGroupMember *tgm, bool is_write)
430
+static void timer_cb(ThrottleGroupMember *tgm, ThrottleDirection direction)
431
{
432
ThrottleState *ts = tgm->throttle_state;
433
ThrottleGroup *tg = container_of(ts, ThrottleGroup, ts);
434
435
/* The timer has just been fired, so we can update the flag */
436
qemu_mutex_lock(&tg->lock);
437
- tg->any_timer_armed[is_write] = false;
438
+ tg->any_timer_armed[direction] = false;
439
qemu_mutex_unlock(&tg->lock);
440
441
/* Run the request that was waiting for this timer */
442
- throttle_group_restart_queue(tgm, is_write);
443
+ throttle_group_restart_queue(tgm, direction);
444
}
445
446
static void read_timer_cb(void *opaque)
447
{
448
- timer_cb(opaque, false);
449
+ timer_cb(opaque, THROTTLE_READ);
450
}
451
452
static void write_timer_cb(void *opaque)
453
{
454
- timer_cb(opaque, true);
455
+ timer_cb(opaque, THROTTLE_WRITE);
456
}
457
458
/* Register a ThrottleGroupMember from the throttling group, also initializing
459
@@ -XXX,XX +XXX,XX @@ void throttle_group_register_tgm(ThrottleGroupMember *tgm,
460
const char *groupname,
461
AioContext *ctx)
462
{
463
- int i;
464
+ ThrottleDirection dir;
465
ThrottleState *ts = throttle_group_incref(groupname);
466
ThrottleGroup *tg = container_of(ts, ThrottleGroup, ts);
467
468
@@ -XXX,XX +XXX,XX @@ void throttle_group_register_tgm(ThrottleGroupMember *tgm,
469
470
QEMU_LOCK_GUARD(&tg->lock);
471
/* If the ThrottleGroup is new set this ThrottleGroupMember as the token */
472
- for (i = 0; i < 2; i++) {
473
- if (!tg->tokens[i]) {
474
- tg->tokens[i] = tgm;
475
+ for (dir = THROTTLE_READ; dir < THROTTLE_MAX; dir++) {
476
+ if (!tg->tokens[dir]) {
477
+ tg->tokens[dir] = tgm;
478
}
479
+ qemu_co_queue_init(&tgm->throttled_reqs[dir]);
480
}
481
482
QLIST_INSERT_HEAD(&tg->head, tgm, round_robin);
483
@@ -XXX,XX +XXX,XX @@ void throttle_group_register_tgm(ThrottleGroupMember *tgm,
484
write_timer_cb,
485
tgm);
486
qemu_co_mutex_init(&tgm->throttled_reqs_lock);
487
- qemu_co_queue_init(&tgm->throttled_reqs[0]);
488
- qemu_co_queue_init(&tgm->throttled_reqs[1]);
489
}
490
491
/* Unregister a ThrottleGroupMember from its group, removing it from the list,
492
@@ -XXX,XX +XXX,XX @@ void throttle_group_unregister_tgm(ThrottleGroupMember *tgm)
493
ThrottleState *ts = tgm->throttle_state;
494
ThrottleGroup *tg = container_of(ts, ThrottleGroup, ts);
495
ThrottleGroupMember *token;
496
- int i;
497
+ ThrottleDirection dir;
498
499
if (!ts) {
500
/* Discard already unregistered tgm */
501
@@ -XXX,XX +XXX,XX @@ void throttle_group_unregister_tgm(ThrottleGroupMember *tgm)
502
AIO_WAIT_WHILE(tgm->aio_context, qatomic_read(&tgm->restart_pending) > 0);
503
504
WITH_QEMU_LOCK_GUARD(&tg->lock) {
505
- for (i = 0; i < 2; i++) {
506
- assert(tgm->pending_reqs[i] == 0);
507
- assert(qemu_co_queue_empty(&tgm->throttled_reqs[i]));
508
- assert(!timer_pending(tgm->throttle_timers.timers[i]));
509
- if (tg->tokens[i] == tgm) {
510
+ for (dir = THROTTLE_READ; dir < THROTTLE_MAX; dir++) {
511
+ assert(tgm->pending_reqs[dir] == 0);
512
+ assert(qemu_co_queue_empty(&tgm->throttled_reqs[dir]));
513
+ assert(!timer_pending(tgm->throttle_timers.timers[dir]));
514
+ if (tg->tokens[dir] == tgm) {
515
token = throttle_group_next_tgm(tgm);
516
/* Take care of the case where this is the last tgm in the group */
517
if (token == tgm) {
518
token = NULL;
519
}
520
- tg->tokens[i] = token;
521
+ tg->tokens[dir] = token;
522
}
523
}
524
525
@@ -XXX,XX +XXX,XX @@ void throttle_group_detach_aio_context(ThrottleGroupMember *tgm)
526
{
527
ThrottleGroup *tg = container_of(tgm->throttle_state, ThrottleGroup, ts);
528
ThrottleTimers *tt = &tgm->throttle_timers;
529
- int i;
530
+ ThrottleDirection dir;
531
532
/* Requests must have been drained */
533
- assert(tgm->pending_reqs[0] == 0 && tgm->pending_reqs[1] == 0);
534
- assert(qemu_co_queue_empty(&tgm->throttled_reqs[0]));
535
- assert(qemu_co_queue_empty(&tgm->throttled_reqs[1]));
536
+ for (dir = THROTTLE_READ; dir < THROTTLE_MAX; dir++) {
537
+ assert(tgm->pending_reqs[dir] == 0);
538
+ assert(qemu_co_queue_empty(&tgm->throttled_reqs[dir]));
539
+ }
540
541
/* Kick off next ThrottleGroupMember, if necessary */
542
WITH_QEMU_LOCK_GUARD(&tg->lock) {
543
- for (i = 0; i < 2; i++) {
544
- if (timer_pending(tt->timers[i])) {
545
- tg->any_timer_armed[i] = false;
546
- schedule_next_request(tgm, i);
547
+ for (dir = THROTTLE_READ; dir < THROTTLE_MAX; dir++) {
548
+ if (timer_pending(tt->timers[dir])) {
549
+ tg->any_timer_armed[dir] = false;
550
+ schedule_next_request(tgm, dir);
551
}
552
}
553
}
554
diff --git a/block/throttle.c b/block/throttle.c
index XXXXXXX..XXXXXXX 100644
--- a/block/throttle.c
+++ b/block/throttle.c
@@ -XXX,XX +XXX,XX @@ throttle_co_preadv(BlockDriverState *bs, int64_t offset, int64_t bytes,
{

ThrottleGroupMember *tgm = bs->opaque;
- throttle_group_co_io_limits_intercept(tgm, bytes, false);
+ throttle_group_co_io_limits_intercept(tgm, bytes, THROTTLE_READ);

return bdrv_co_preadv(bs->file, offset, bytes, qiov, flags);
}
@@ -XXX,XX +XXX,XX @@ throttle_co_pwritev(BlockDriverState *bs, int64_t offset, int64_t bytes,
QEMUIOVector *qiov, BdrvRequestFlags flags)
{
ThrottleGroupMember *tgm = bs->opaque;
- throttle_group_co_io_limits_intercept(tgm, bytes, true);
+ throttle_group_co_io_limits_intercept(tgm, bytes, THROTTLE_WRITE);

return bdrv_co_pwritev(bs->file, offset, bytes, qiov, flags);
}
@@ -XXX,XX +XXX,XX @@ throttle_co_pwrite_zeroes(BlockDriverState *bs, int64_t offset, int64_t bytes,
BdrvRequestFlags flags)
{
ThrottleGroupMember *tgm = bs->opaque;
- throttle_group_co_io_limits_intercept(tgm, bytes, true);
+ throttle_group_co_io_limits_intercept(tgm, bytes, THROTTLE_WRITE);

return bdrv_co_pwrite_zeroes(bs->file, offset, bytes, flags);
}
@@ -XXX,XX +XXX,XX @@ static int coroutine_fn GRAPH_RDLOCK
throttle_co_pdiscard(BlockDriverState *bs, int64_t offset, int64_t bytes)
{
ThrottleGroupMember *tgm = bs->opaque;
- throttle_group_co_io_limits_intercept(tgm, bytes, true);
+ throttle_group_co_io_limits_intercept(tgm, bytes, THROTTLE_WRITE);

return bdrv_co_pdiscard(bs->file, offset, bytes);
}
--
2.41.0

index XXXXXXX..XXXXXXX 100644
--- a/tests/vhost-user-bridge.c
+++ b/tests/vhost-user-bridge.c
@@ -XXX,XX +XXX,XX @@ vubr_accept_cb(int sock, void *ctx)
VHOST_USER_BRIDGE_MAX_QUEUES,
conn_fd,
vubr_panic,
+ NULL,
vubr_set_watch,
vubr_remove_watch,
&vuiface)) {
@@ -XXX,XX +XXX,XX @@ vubr_new(const char *path, bool client)
VHOST_USER_BRIDGE_MAX_QUEUES,
dev->sock,
vubr_panic,
+ NULL,
vubr_set_watch,
vubr_remove_watch,
&vuiface)) {
diff --git a/tools/virtiofsd/fuse_virtio.c b/tools/virtiofsd/fuse_virtio.c
index XXXXXXX..XXXXXXX 100644
--- a/tools/virtiofsd/fuse_virtio.c
+++ b/tools/virtiofsd/fuse_virtio.c
@@ -XXX,XX +XXX,XX @@ int virtio_session_mount(struct fuse_session *se)
se->vu_socketfd = data_sock;
se->virtio_dev->se = se;
pthread_rwlock_init(&se->virtio_dev->vu_dispatch_rwlock, NULL);
- vu_init(&se->virtio_dev->dev, 2, se->vu_socketfd, fv_panic, fv_set_watch,
- fv_remove_watch, &fv_iface);
+ vu_init(&se->virtio_dev->dev, 2, se->vu_socketfd, fv_panic, NULL,
+ fv_set_watch, fv_remove_watch, &fv_iface);

return 0;
}
--
2.26.2
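The throttling refactoring above replaces `bool is_write` array indices with a `ThrottleDirection` value and bounds its loops with `THROTTLE_MAX` instead of a literal `2`. A minimal standalone sketch of that pattern follows. The enumerator names are taken from the patch, but their numeric values are assumed, and `Member`, `member_init`, `member_queue` and `member_total_pending` are hypothetical stand-ins rather than QEMU code.

```c
#include <assert.h>
#include <stddef.h>

/* Enumerator names as used in the patch; the numeric values are assumed. */
typedef enum {
    THROTTLE_READ = 0,
    THROTTLE_WRITE,
    THROTTLE_MAX
} ThrottleDirection;

/* Hypothetical stand-in for the per-direction state of a group member. */
typedef struct {
    unsigned pending_reqs[THROTTLE_MAX];
} Member;

static void member_init(Member *m)
{
    ThrottleDirection dir;

    /* Iterate over the enum range instead of hard-coding "i < 2". */
    for (dir = THROTTLE_READ; dir < THROTTLE_MAX; dir++) {
        m->pending_reqs[dir] = 0;
    }
}

static void member_queue(Member *m, ThrottleDirection direction)
{
    /* Same guard the patch adds before indexing per-direction arrays. */
    assert(direction < THROTTLE_MAX);
    m->pending_reqs[direction]++;
}

static unsigned member_total_pending(const Member *m)
{
    ThrottleDirection dir;
    unsigned total = 0;

    for (dir = THROTTLE_READ; dir < THROTTLE_MAX; dir++) {
        total += m->pending_reqs[dir];
    }
    return total;
}
```

Sizing the arrays with `THROTTLE_MAX` keeps the array length and the loop bound tied to a single definition, which is presumably why the patch drops the literal `2`.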
From: Coiby Xu <coiby.xu@gmail.com>

When the client is running under gdb and the quit command is issued in gdb,
QEMU still dispatches the event, which causes a segmentation fault in
the callback function.

Signed-off-by: Coiby Xu <coiby.xu@gmail.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20200918080912.321299-3-coiby.xu@gmail.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
contrib/libvhost-user/libvhost-user.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/contrib/libvhost-user/libvhost-user.c b/contrib/libvhost-user/libvhost-user.c
index XXXXXXX..XXXXXXX 100644
--- a/contrib/libvhost-user/libvhost-user.c
+++ b/contrib/libvhost-user/libvhost-user.c
@@ -XXX,XX +XXX,XX @@ vu_deinit(VuDev *dev)
}

if (vq->kick_fd != -1) {
+ dev->remove_watch(dev, vq->kick_fd);
close(vq->kick_fd);
vq->kick_fd = -1;
}
--
2.26.2
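The one-line fix above unregisters the kick fd from the event loop before closing it, so the loop can no longer dispatch a callback for a descriptor that is already gone. The ordering can be sketched in isolation as follows. `WatchTable`, `watch_add`, `watch_contains`, `watch_remove` and `queue_teardown` are hypothetical names used only for illustration; they are not part of libvhost-user.

```c
#include <assert.h>
#include <stdbool.h>
#include <unistd.h>

#define MAX_WATCHES 8

/* Hypothetical event-loop bookkeeping: fds that currently have a handler. */
typedef struct {
    int watched_fds[MAX_WATCHES];
    int count;
} WatchTable;

static void watch_add(WatchTable *t, int fd)
{
    assert(t->count < MAX_WATCHES);
    t->watched_fds[t->count++] = fd;
}

static bool watch_contains(const WatchTable *t, int fd)
{
    for (int i = 0; i < t->count; i++) {
        if (t->watched_fds[i] == fd) {
            return true;
        }
    }
    return false;
}

static void watch_remove(WatchTable *t, int fd)
{
    for (int i = 0; i < t->count; i++) {
        if (t->watched_fds[i] == fd) {
            /* Fill the hole with the last entry; order does not matter. */
            t->watched_fds[i] = t->watched_fds[--t->count];
            return;
        }
    }
}

/* Unregister first, then close, so no handler can fire on a closed fd. */
static void queue_teardown(WatchTable *t, int *kick_fd)
{
    if (*kick_fd != -1) {
        watch_remove(t, *kick_fd);
        close(*kick_fd);
        *kick_fd = -1;
    }
}
```

Closing first and unregistering afterwards would leave a window in which the loop still holds a handler for a descriptor number that the kernel may already have reused.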
1
From: Coiby Xu <coiby.xu@gmail.com>
1
2
3
Sharing QEMU devices via vhost-user protocol.
4
5
Only one vhost-user client can connect to the server at a time.
6
7
Suggested-by: Kevin Wolf <kwolf@redhat.com>
8
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
9
Signed-off-by: Coiby Xu <coiby.xu@gmail.com>
10
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
11
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
12
Message-id: 20200918080912.321299-4-coiby.xu@gmail.com
13
[Fixed size_t %lu -> %zu format string compiler error.
14
--Stefan]
15
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
16
---
17
util/vhost-user-server.h | 65 ++++++
18
util/vhost-user-server.c | 428 +++++++++++++++++++++++++++++++++++++++
19
util/meson.build | 1 +
20
3 files changed, 494 insertions(+)
21
create mode 100644 util/vhost-user-server.h
22
create mode 100644 util/vhost-user-server.c
23
24
diff --git a/util/vhost-user-server.h b/util/vhost-user-server.h
25
new file mode 100644
26
index XXXXXXX..XXXXXXX
27
--- /dev/null
28
+++ b/util/vhost-user-server.h
29
@@ -XXX,XX +XXX,XX @@
30
+/*
31
+ * Sharing QEMU devices via vhost-user protocol
32
+ *
33
+ * Copyright (c) Coiby Xu <coiby.xu@gmail.com>.
34
+ * Copyright (c) 2020 Red Hat, Inc.
35
+ *
36
+ * This work is licensed under the terms of the GNU GPL, version 2 or
37
+ * later. See the COPYING file in the top-level directory.
38
+ */
39
+
40
+#ifndef VHOST_USER_SERVER_H
41
+#define VHOST_USER_SERVER_H
42
+
43
+#include "contrib/libvhost-user/libvhost-user.h"
44
+#include "io/channel-socket.h"
45
+#include "io/channel-file.h"
46
+#include "io/net-listener.h"
47
+#include "qemu/error-report.h"
48
+#include "qapi/error.h"
49
+#include "standard-headers/linux/virtio_blk.h"
50
+
51
+typedef struct VuFdWatch {
52
+ VuDev *vu_dev;
53
+ int fd; /*kick fd*/
54
+ void *pvt;
55
+ vu_watch_cb cb;
56
+ bool processing;
57
+ QTAILQ_ENTRY(VuFdWatch) next;
58
+} VuFdWatch;
59
+
60
+typedef struct VuServer VuServer;
61
+typedef void DevicePanicNotifierFn(VuServer *server);
62
+
63
+struct VuServer {
64
+ QIONetListener *listener;
65
+ AioContext *ctx;
66
+ DevicePanicNotifierFn *device_panic_notifier;
67
+ int max_queues;
68
+ const VuDevIface *vu_iface;
69
+ VuDev vu_dev;
70
+ QIOChannel *ioc; /* The I/O channel with the client */
71
+ QIOChannelSocket *sioc; /* The underlying data channel with the client */
72
+ /* IOChannel for fd provided via VHOST_USER_SET_SLAVE_REQ_FD */
73
+ QIOChannel *ioc_slave;
74
+ QIOChannelSocket *sioc_slave;
75
+ Coroutine *co_trip; /* coroutine for processing VhostUserMsg */
76
+ QTAILQ_HEAD(, VuFdWatch) vu_fd_watches;
77
+ /* restart coroutine co_trip if AIOContext is changed */
78
+ bool aio_context_changed;
79
+ bool processing_msg;
80
+};
81
+
82
+bool vhost_user_server_start(VuServer *server,
83
+ SocketAddress *unix_socket,
84
+ AioContext *ctx,
85
+ uint16_t max_queues,
86
+ DevicePanicNotifierFn *device_panic_notifier,
87
+ const VuDevIface *vu_iface,
88
+ Error **errp);
89
+
90
+void vhost_user_server_stop(VuServer *server);
91
+
92
+void vhost_user_server_set_aio_context(VuServer *server, AioContext *ctx);
93
+
94
+#endif /* VHOST_USER_SERVER_H */
95
diff --git a/util/vhost-user-server.c b/util/vhost-user-server.c
96
new file mode 100644
97
index XXXXXXX..XXXXXXX
98
--- /dev/null
99
+++ b/util/vhost-user-server.c
100
@@ -XXX,XX +XXX,XX @@
101
+/*
102
+ * Sharing QEMU devices via vhost-user protocol
103
+ *
104
+ * Copyright (c) Coiby Xu <coiby.xu@gmail.com>.
105
+ * Copyright (c) 2020 Red Hat, Inc.
106
+ *
107
+ * This work is licensed under the terms of the GNU GPL, version 2 or
108
+ * later. See the COPYING file in the top-level directory.
109
+ */
110
+#include "qemu/osdep.h"
111
+#include "qemu/main-loop.h"
112
+#include "vhost-user-server.h"
113
+
114
+static void vmsg_close_fds(VhostUserMsg *vmsg)
115
+{
116
+ int i;
117
+ for (i = 0; i < vmsg->fd_num; i++) {
118
+ close(vmsg->fds[i]);
119
+ }
120
+}
121
+
122
+static void vmsg_unblock_fds(VhostUserMsg *vmsg)
123
+{
124
+ int i;
125
+ for (i = 0; i < vmsg->fd_num; i++) {
126
+ qemu_set_nonblock(vmsg->fds[i]);
127
+ }
128
+}
129
+
130
+static void vu_accept(QIONetListener *listener, QIOChannelSocket *sioc,
131
+ gpointer opaque);
132
+
133
+static void close_client(VuServer *server)
134
+{
135
+ /*
136
+ * Before closing the client
137
+ *
138
+ * 1. Let vu_client_trip stop processing new vhost-user msg
139
+ *
140
+ * 2. remove kick_handler
141
+ *
142
+ * 3. wait for the kick handler to be finished
143
+ *
144
+ * 4. wait for the current vhost-user msg to be finished processing
145
+ */
146
+
147
+ QIOChannelSocket *sioc = server->sioc;
148
+ /* When this is set vu_client_trip will stop processing new vhost-user messages */
149
+ server->sioc = NULL;
150
+
151
+ VuFdWatch *vu_fd_watch, *next;
152
+ QTAILQ_FOREACH_SAFE(vu_fd_watch, &server->vu_fd_watches, next, next) {
153
+ aio_set_fd_handler(server->ioc->ctx, vu_fd_watch->fd, true, NULL,
154
+ NULL, NULL, NULL);
155
+ }
156
+
157
+ while (!QTAILQ_EMPTY(&server->vu_fd_watches)) {
158
+ QTAILQ_FOREACH_SAFE(vu_fd_watch, &server->vu_fd_watches, next, next) {
159
+ if (!vu_fd_watch->processing) {
160
+ QTAILQ_REMOVE(&server->vu_fd_watches, vu_fd_watch, next);
161
+ g_free(vu_fd_watch);
162
+ }
163
+ }
164
+ }
165
+
166
+ while (server->processing_msg) {
167
+ if (server->ioc->read_coroutine) {
168
+ server->ioc->read_coroutine = NULL;
169
+ qio_channel_set_aio_fd_handler(server->ioc, server->ioc->ctx, NULL,
170
+ NULL, server->ioc);
171
+ server->processing_msg = false;
172
+ }
173
+ }
174
+
175
+ vu_deinit(&server->vu_dev);
176
+ object_unref(OBJECT(sioc));
177
+ object_unref(OBJECT(server->ioc));
178
+}
179
+
180
+static void panic_cb(VuDev *vu_dev, const char *buf)
181
+{
182
+ VuServer *server = container_of(vu_dev, VuServer, vu_dev);
183
+
184
+ /* avoid while loop in close_client */
185
+ server->processing_msg = false;
186
+
187
+ if (buf) {
188
+ error_report("vu_panic: %s", buf);
189
+ }
190
+
191
+ if (server->sioc) {
192
+ close_client(server);
193
+ }
194
+
195
+ if (server->device_panic_notifier) {
196
+ server->device_panic_notifier(server);
197
+ }
198
+
199
+ /*
200
+ * Set the callback function for network listener so another
201
+ * vhost-user client can connect to this server
202
+ */
203
+ qio_net_listener_set_client_func(server->listener,
204
+ vu_accept,
205
+ server,
206
+ NULL);
207
+}
208
+
209
+static bool coroutine_fn
210
+vu_message_read(VuDev *vu_dev, int conn_fd, VhostUserMsg *vmsg)
211
+{
212
+ struct iovec iov = {
213
+ .iov_base = (char *)vmsg,
214
+ .iov_len = VHOST_USER_HDR_SIZE,
215
+ };
216
+ int rc, read_bytes = 0;
217
+ Error *local_err = NULL;
218
+ /*
219
+ * Store fds/nfds returned from qio_channel_readv_full into
220
+ * temporary variables.
221
+ *
222
+ * VhostUserMsg is a packed structure, gcc will complain about passing
223
+ * pointer to a packed structure member if we pass &VhostUserMsg.fd_num
224
+ * and &VhostUserMsg.fds directly when calling qio_channel_readv_full,
225
+ * thus two temporary variables nfds and fds are used here.
226
+ */
227
+ size_t nfds = 0, nfds_t = 0;
228
+ const size_t max_fds = G_N_ELEMENTS(vmsg->fds);
229
+ int *fds_t = NULL;
230
+ VuServer *server = container_of(vu_dev, VuServer, vu_dev);
231
+ QIOChannel *ioc = server->ioc;
232
+
233
+ if (!ioc) {
234
+ error_report_err(local_err);
235
+ goto fail;
236
+ }
237
+
238
+ assert(qemu_in_coroutine());
239
+ do {
240
+ /*
241
+ * qio_channel_readv_full may return short reads, so keep calling it
242
+ * until getting VHOST_USER_HDR_SIZE or 0 bytes in total
243
+ */
244
+ rc = qio_channel_readv_full(ioc, &iov, 1, &fds_t, &nfds_t, &local_err);
245
+ if (rc < 0) {
246
+ if (rc == QIO_CHANNEL_ERR_BLOCK) {
247
+ qio_channel_yield(ioc, G_IO_IN);
248
+ continue;
249
+ } else {
250
+ error_report_err(local_err);
251
+ return false;
252
+ }
253
+ }
254
+ read_bytes += rc;
255
+ if (nfds_t > 0) {
256
+ if (nfds + nfds_t > max_fds) {
257
+ error_report("A maximum of %zu fds are allowed, "
258
+ "however got %zu fds now",
259
+ max_fds, nfds + nfds_t);
260
+ goto fail;
261
+ }
262
+ memcpy(vmsg->fds + nfds, fds_t,
263
+ nfds_t *sizeof(vmsg->fds[0]));
264
+ nfds += nfds_t;
265
+ g_free(fds_t);
266
+ }
267
+ if (read_bytes == VHOST_USER_HDR_SIZE || rc == 0) {
268
+ break;
269
+ }
270
+ iov.iov_base = (char *)vmsg + read_bytes;
271
+ iov.iov_len = VHOST_USER_HDR_SIZE - read_bytes;
272
+ } while (true);
273
+
274
+ vmsg->fd_num = nfds;
275
+ /* qio_channel_readv_full will make socket fds blocking, unblock them */
276
+ vmsg_unblock_fds(vmsg);
277
+ if (vmsg->size > sizeof(vmsg->payload)) {
278
+ error_report("Error: too big message request: %d, "
279
+ "size: vmsg->size: %u, "
280
+ "while sizeof(vmsg->payload) = %zu",
281
+ vmsg->request, vmsg->size, sizeof(vmsg->payload));
282
+ goto fail;
283
+ }
284
+
285
+ struct iovec iov_payload = {
286
+ .iov_base = (char *)&vmsg->payload,
287
+ .iov_len = vmsg->size,
288
+ };
289
+ if (vmsg->size) {
290
+ rc = qio_channel_readv_all_eof(ioc, &iov_payload, 1, &local_err);
291
+ if (rc == -1) {
292
+ error_report_err(local_err);
293
+ goto fail;
294
+ }
295
+ }
296
+
297
+ return true;
298
+
299
+fail:
300
+ vmsg_close_fds(vmsg);
301
+
302
+ return false;
303
+}
304
+
305
+
306
+static void vu_client_start(VuServer *server);
307
+static coroutine_fn void vu_client_trip(void *opaque)
308
+{
309
+ VuServer *server = opaque;
310
+
311
+ while (!server->aio_context_changed && server->sioc) {
312
+ server->processing_msg = true;
313
+ vu_dispatch(&server->vu_dev);
314
+ server->processing_msg = false;
315
+ }
316
+
317
+ if (server->aio_context_changed && server->sioc) {
318
+ server->aio_context_changed = false;
319
+ vu_client_start(server);
320
+ }
321
+}
322
+
323
+static void vu_client_start(VuServer *server)
324
+{
325
+ server->co_trip = qemu_coroutine_create(vu_client_trip, server);
326
+ aio_co_enter(server->ctx, server->co_trip);
327
+}
328
+
329
+/*
330
+ * a wrapper for vu_kick_cb
331
+ *
332
+ * since aio_dispatch can only pass one user data pointer to the
333
+ * callback function, pack VuDev and pvt into a struct. Then unpack it
334
+ * and pass them to vu_kick_cb
335
+ */
336
+static void kick_handler(void *opaque)
337
+{
338
+ VuFdWatch *vu_fd_watch = opaque;
339
+ vu_fd_watch->processing = true;
340
+ vu_fd_watch->cb(vu_fd_watch->vu_dev, 0, vu_fd_watch->pvt);
341
+ vu_fd_watch->processing = false;
342
+}
343
+
344
+
345
+static VuFdWatch *find_vu_fd_watch(VuServer *server, int fd)
346
+{
347
+
348
+ VuFdWatch *vu_fd_watch, *next;
349
+ QTAILQ_FOREACH_SAFE(vu_fd_watch, &server->vu_fd_watches, next, next) {
350
+ if (vu_fd_watch->fd == fd) {
351
+ return vu_fd_watch;
352
+ }
353
+ }
354
+ return NULL;
355
+}
356
+
357
+static void
358
+set_watch(VuDev *vu_dev, int fd, int vu_evt,
359
+ vu_watch_cb cb, void *pvt)
360
+{
361
+
362
+ VuServer *server = container_of(vu_dev, VuServer, vu_dev);
363
+ g_assert(vu_dev);
364
+ g_assert(fd >= 0);
365
+ g_assert(cb);
366
+
367
+ VuFdWatch *vu_fd_watch = find_vu_fd_watch(server, fd);
368
+
369
+ if (!vu_fd_watch) {
370
+ VuFdWatch *vu_fd_watch = g_new0(VuFdWatch, 1);
371
+
372
+ QTAILQ_INSERT_TAIL(&server->vu_fd_watches, vu_fd_watch, next);
373
+
374
+ vu_fd_watch->fd = fd;
375
+ vu_fd_watch->cb = cb;
376
+ qemu_set_nonblock(fd);
377
+ aio_set_fd_handler(server->ioc->ctx, fd, true, kick_handler,
378
+ NULL, NULL, vu_fd_watch);
379
+ vu_fd_watch->vu_dev = vu_dev;
380
+ vu_fd_watch->pvt = pvt;
381
+ }
382
+}
383
+
384
+
385
+static void remove_watch(VuDev *vu_dev, int fd)
386
+{
387
+ VuServer *server;
388
+ g_assert(vu_dev);
389
+ g_assert(fd >= 0);
390
+
391
+ server = container_of(vu_dev, VuServer, vu_dev);
392
+
393
+ VuFdWatch *vu_fd_watch = find_vu_fd_watch(server, fd);
394
+
395
+ if (!vu_fd_watch) {
396
+ return;
397
+ }
398
+ aio_set_fd_handler(server->ioc->ctx, fd, true, NULL, NULL, NULL, NULL);
399
+
400
+ QTAILQ_REMOVE(&server->vu_fd_watches, vu_fd_watch, next);
401
+ g_free(vu_fd_watch);
402
+}
403
+
404
+
405
+static void vu_accept(QIONetListener *listener, QIOChannelSocket *sioc,
406
+ gpointer opaque)
407
+{
408
+ VuServer *server = opaque;
409
+
410
+ if (server->sioc) {
411
+ warn_report("Only one vhost-user client is allowed to "
412
+ "connect the server one time");
413
+ return;
414
+ }
415
+
416
+ if (!vu_init(&server->vu_dev, server->max_queues, sioc->fd, panic_cb,
417
+ vu_message_read, set_watch, remove_watch, server->vu_iface)) {
418
+ error_report("Failed to initialize libvhost-user");
419
+ return;
420
+ }
421
+
422
+ /*
423
+ * Unset the callback function for network listener to make another
424
+ * vhost-user client keep waiting until this client disconnects
425
+ */
426
+ qio_net_listener_set_client_func(server->listener,
427
+ NULL,
428
+ NULL,
429
+ NULL);
430
+ server->sioc = sioc;
431
+ /*
432
+ * Increase the object reference, so sioc will not be freed by
433
+ * qio_net_listener_channel_func which will call object_unref(OBJECT(sioc))
434
+ */
435
+ object_ref(OBJECT(server->sioc));
436
+ qio_channel_set_name(QIO_CHANNEL(sioc), "vhost-user client");
437
+ server->ioc = QIO_CHANNEL(sioc);
438
+ object_ref(OBJECT(server->ioc));
439
+ qio_channel_attach_aio_context(server->ioc, server->ctx);
440
+ qio_channel_set_blocking(QIO_CHANNEL(server->sioc), false, NULL);
441
+ vu_client_start(server);
442
+}
443
+
444
+
445
+void vhost_user_server_stop(VuServer *server)
446
+{
447
+ if (server->sioc) {
448
+ close_client(server);
449
+ }
450
+
451
+ if (server->listener) {
452
+ qio_net_listener_disconnect(server->listener);
453
+ object_unref(OBJECT(server->listener));
454
+ }
455
+
456
+}
457
+
458
+void vhost_user_server_set_aio_context(VuServer *server, AioContext *ctx)
459
+{
460
+ VuFdWatch *vu_fd_watch, *next;
461
+ void *opaque = NULL;
462
+ IOHandler *io_read = NULL;
463
+ bool attach;
464
+
465
+ server->ctx = ctx ? ctx : qemu_get_aio_context();
466
+
467
+ if (!server->sioc) {
468
+ /* not yet serving any client*/
469
+ return;
470
+ }
471
+
472
+ if (ctx) {
473
+ qio_channel_attach_aio_context(server->ioc, ctx);
474
+ server->aio_context_changed = true;
475
+ io_read = kick_handler;
476
+ attach = true;
477
+ } else {
478
+ qio_channel_detach_aio_context(server->ioc);
479
+ /* server->ioc->ctx keeps the old AioContext */
480
+ ctx = server->ioc->ctx;
481
+ attach = false;
482
+ }
483
+
484
+ QTAILQ_FOREACH_SAFE(vu_fd_watch, &server->vu_fd_watches, next, next) {
485
+ if (vu_fd_watch->cb) {
486
+ opaque = attach ? vu_fd_watch : NULL;
487
+ aio_set_fd_handler(ctx, vu_fd_watch->fd, true,
488
+ io_read, NULL, NULL,
489
+ opaque);
490
+ }
491
+ }
492
+}
493
+
494
+
495
+bool vhost_user_server_start(VuServer *server,
496
+ SocketAddress *socket_addr,
497
+ AioContext *ctx,
498
+ uint16_t max_queues,
499
+ DevicePanicNotifierFn *device_panic_notifier,
500
+ const VuDevIface *vu_iface,
501
+ Error **errp)
502
+{
503
+ QIONetListener *listener = qio_net_listener_new();
504
+ if (qio_net_listener_open_sync(listener, socket_addr, 1,
505
+ errp) < 0) {
506
+ object_unref(OBJECT(listener));
507
+ return false;
508
+ }
509
+
510
+ /* zero out unspecified fields */
511
+ *server = (VuServer) {
512
+ .listener = listener,
513
+ .vu_iface = vu_iface,
514
+ .max_queues = max_queues,
515
+ .ctx = ctx,
516
+ .device_panic_notifier = device_panic_notifier,
517
+ };
518
+
519
+ qio_net_listener_set_name(server->listener, "vhost-user-backend-listener");
520
+
521
+ qio_net_listener_set_client_func(server->listener,
522
+ vu_accept,
523
+ server,
524
+ NULL);
525
+
526
+ QTAILQ_INIT(&server->vu_fd_watches);
527
+ return true;
528
+}
529
diff --git a/util/meson.build b/util/meson.build
530
index XXXXXXX..XXXXXXX 100644
531
--- a/util/meson.build
532
+++ b/util/meson.build
533
@@ -XXX,XX +XXX,XX @@ if have_block
534
util_ss.add(files('main-loop.c'))
535
util_ss.add(files('nvdimm-utils.c'))
536
util_ss.add(files('qemu-coroutine.c', 'qemu-coroutine-lock.c', 'qemu-coroutine-io.c'))
537
+ util_ss.add(when: 'CONFIG_LINUX', if_true: files('vhost-user-server.c'))
538
util_ss.add(files('qemu-coroutine-sleep.c'))
539
util_ss.add(files('qemu-co-shared-resource.c'))
540
util_ss.add(files('thread-pool.c', 'qemu-timer.c'))
541
--
542
2.26.2
543
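vu_message_read() in the patch above keeps calling qio_channel_readv_full() until it has collected VHOST_USER_HDR_SIZE bytes or sees EOF, because a single read may return fewer bytes than requested. The same loop shape can be sketched with plain POSIX read(). This is a simplification under stated assumptions: `read_full` is a hypothetical helper, it works on a blocking file descriptor, and it retries on EINTR where the patch instead yields the coroutine on QIO_CHANNEL_ERR_BLOCK.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read until exactly len bytes have arrived or the peer closes the fd.
 * Returns the number of bytes read (less than len only on EOF),
 * or -1 on a read error.
 */
static ssize_t read_full(int fd, void *buf, size_t len)
{
    size_t done = 0;

    while (done < len) {
        ssize_t rc = read(fd, (char *)buf + done, len - done);
        if (rc < 0) {
            if (errno == EINTR) {
                continue;   /* interrupted by a signal: retry */
            }
            return -1;
        }
        if (rc == 0) {
            break;          /* EOF: return what has been read so far */
        }
        /* Short read: advance and ask only for the remainder. */
        done += (size_t)rc;
    }
    return (ssize_t)done;
}
```

As in the patch, the buffer pointer and the remaining length are both adjusted after each partial read, so a header split across several reads is reassembled in place.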
1
From: Coiby Xu <coiby.xu@gmail.com>
1
2
3
Move the constants from hw/core/qdev-properties.c to
4
util/block-helpers.h so that knowledge of the min/max values is
5
6
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
7
Signed-off-by: Coiby Xu <coiby.xu@gmail.com>
8
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
9
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
10
Acked-by: Eduardo Habkost <ehabkost@redhat.com>
11
Message-id: 20200918080912.321299-5-coiby.xu@gmail.com
12
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
13
---
14
util/block-helpers.h | 19 +++++++++++++
15
hw/core/qdev-properties-system.c | 31 ++++-----------------
16
util/block-helpers.c | 46 ++++++++++++++++++++++++++++++++
17
util/meson.build | 1 +
18
4 files changed, 71 insertions(+), 26 deletions(-)
19
create mode 100644 util/block-helpers.h
20
create mode 100644 util/block-helpers.c
21
22
diff --git a/util/block-helpers.h b/util/block-helpers.h
23
new file mode 100644
24
index XXXXXXX..XXXXXXX
25
--- /dev/null
26
+++ b/util/block-helpers.h
27
@@ -XXX,XX +XXX,XX @@
28
+#ifndef BLOCK_HELPERS_H
29
+#define BLOCK_HELPERS_H
30
+
31
+#include "qemu/units.h"
32
+
33
+/* lower limit is sector size */
34
+#define MIN_BLOCK_SIZE INT64_C(512)
35
+#define MIN_BLOCK_SIZE_STR "512 B"
36
+/*
37
+ * upper limit is arbitrary, 2 MiB looks sufficient for all sensible uses, and
38
+ * matches qcow2 cluster size limit
39
+ */
40
+#define MAX_BLOCK_SIZE (2 * MiB)
41
+#define MAX_BLOCK_SIZE_STR "2 MiB"
42
+
43
+void check_block_size(const char *id, const char *name, int64_t value,
44
+ Error **errp);
45
+
46
+#endif /* BLOCK_HELPERS_H */
47
diff --git a/hw/core/qdev-properties-system.c b/hw/core/qdev-properties-system.c
48
index XXXXXXX..XXXXXXX 100644
49
--- a/hw/core/qdev-properties-system.c
50
+++ b/hw/core/qdev-properties-system.c
51
@@ -XXX,XX +XXX,XX @@
52
#include "sysemu/blockdev.h"
53
#include "net/net.h"
54
#include "hw/pci/pci.h"
55
+#include "util/block-helpers.h"
56
57
static bool check_prop_still_unset(DeviceState *dev, const char *name,
58
const void *old_val, const char *new_val,
59
@@ -XXX,XX +XXX,XX @@ const PropertyInfo qdev_prop_losttickpolicy = {
60
61
/* --- blocksize --- */
62
63
-/* lower limit is sector size */
64
-#define MIN_BLOCK_SIZE 512
65
-#define MIN_BLOCK_SIZE_STR "512 B"
66
-/*
67
- * upper limit is arbitrary, 2 MiB looks sufficient for all sensible uses, and
68
- * matches qcow2 cluster size limit
69
- */
70
-#define MAX_BLOCK_SIZE (2 * MiB)
71
-#define MAX_BLOCK_SIZE_STR "2 MiB"
72
-
73
static void set_blocksize(Object *obj, Visitor *v, const char *name,
74
void *opaque, Error **errp)
75
{
76
@@ -XXX,XX +XXX,XX @@ static void set_blocksize(Object *obj, Visitor *v, const char *name,
77
Property *prop = opaque;
78
uint32_t *ptr = qdev_get_prop_ptr(dev, prop);
79
uint64_t value;
80
+ Error *local_err = NULL;
81
82
if (dev->realized) {
83
qdev_prop_set_after_realize(dev, name, errp);
84
@@ -XXX,XX +XXX,XX @@ static void set_blocksize(Object *obj, Visitor *v, const char *name,
85
if (!visit_type_size(v, name, &value, errp)) {
86
return;
87
}
88
- /* value of 0 means "unset" */
89
- if (value && (value < MIN_BLOCK_SIZE || value > MAX_BLOCK_SIZE)) {
90
- error_setg(errp,
91
- "Property %s.%s doesn't take value %" PRIu64
92
- " (minimum: " MIN_BLOCK_SIZE_STR
93
- ", maximum: " MAX_BLOCK_SIZE_STR ")",
94
- dev->id ? : "", name, value);
95
+ check_block_size(dev->id ? : "", name, value, &local_err);
96
+ if (local_err) {
97
+ error_propagate(errp, local_err);
98
return;
99
}
100
-
101
- /* We rely on power-of-2 blocksizes for bitmasks */
102
- if ((value & (value - 1)) != 0) {
103
- error_setg(errp,
104
- "Property %s.%s doesn't take value '%" PRId64 "', "
105
- "it's not a power of 2", dev->id ?: "", name, (int64_t)value);
106
- return;
107
- }
108
-
109
*ptr = value;
110
}
111
112
diff --git a/util/block-helpers.c b/util/block-helpers.c
113
new file mode 100644
114
index XXXXXXX..XXXXXXX
115
--- /dev/null
116
+++ b/util/block-helpers.c
117
@@ -XXX,XX +XXX,XX @@
118
+/*
119
+ * Block utility functions
120
+ *
121
+ * Copyright IBM, Corp. 2011
122
+ * Copyright (c) 2020 Coiby Xu <coiby.xu@gmail.com>
123
+ *
124
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
125
+ * See the COPYING file in the top-level directory.
126
+ */
127
+
128
+#include "qemu/osdep.h"
129
+#include "qapi/error.h"
130
+#include "qapi/qmp/qerror.h"
131
+#include "block-helpers.h"
132
+
133
+/**
134
+ * check_block_size:
135
+ * @id: The unique ID of the object
136
+ * @name: The name of the property being validated
137
+ * @value: The block size in bytes
138
+ * @errp: A pointer to an area to store an error
139
+ *
140
+ * This function checks that the block size meets the following conditions:
141
+ * 1. At least MIN_BLOCK_SIZE
142
+ * 2. No larger than MAX_BLOCK_SIZE
143
+ * 3. A power of 2
144
+ */
145
+void check_block_size(const char *id, const char *name, int64_t value,
146
+ Error **errp)
147
+{
148
+ /* value of 0 means "unset" */
149
+ if (value && (value < MIN_BLOCK_SIZE || value > MAX_BLOCK_SIZE)) {
150
+ error_setg(errp, QERR_PROPERTY_VALUE_OUT_OF_RANGE,
151
+ id, name, value, MIN_BLOCK_SIZE, MAX_BLOCK_SIZE);
152
+ return;
153
+ }
154
+
155
+ /* We rely on power-of-2 blocksizes for bitmasks */
156
+ if ((value & (value - 1)) != 0) {
157
+ error_setg(errp,
158
+ "Property %s.%s doesn't take value '%" PRId64
159
+ "', it's not a power of 2",
160
+ id, name, value);
161
+ return;
162
+ }
163
+}
164
diff --git a/util/meson.build b/util/meson.build
165
index XXXXXXX..XXXXXXX 100644
166
--- a/util/meson.build
167
+++ b/util/meson.build
168
@@ -XXX,XX +XXX,XX @@ if have_block
169
util_ss.add(files('nvdimm-utils.c'))
170
util_ss.add(files('qemu-coroutine.c', 'qemu-coroutine-lock.c', 'qemu-coroutine-io.c'))
171
util_ss.add(when: 'CONFIG_LINUX', if_true: files('vhost-user-server.c'))
172
+ util_ss.add(files('block-helpers.c'))
173
util_ss.add(files('qemu-coroutine-sleep.c'))
174
util_ss.add(files('qemu-co-shared-resource.c'))
175
util_ss.add(files('thread-pool.c', 'qemu-timer.c'))
176
--
177
2.26.2
178
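Note on the check_block_size() helper added above: the three documented
conditions can be exercised in isolation with a minimal sketch like the one
below. It is not the QEMU implementation. The names sketch_block_size_ok,
SKETCH_MIN_BLOCK_SIZE and SKETCH_MAX_BLOCK_SIZE are hypothetical, the 512-byte
minimum is an assumption (only the 2 MiB maximum appears in the hunk above),
and the sketch returns a bool instead of filling an Error object.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed limits for illustration only; the real values live in QEMU. */
#define SKETCH_MIN_BLOCK_SIZE INT64_C(512)
#define SKETCH_MAX_BLOCK_SIZE (INT64_C(2) * 1024 * 1024)

/* Hypothetical helper: true if value would pass all three checks. */
static bool sketch_block_size_ok(int64_t value)
{
    /* A value of 0 means "unset" and is accepted, as in the patch. */
    if (value == 0) {
        return true;
    }
    /* Conditions 1 and 2: within [min, max]. Negative values fail here. */
    if (value < SKETCH_MIN_BLOCK_SIZE || value > SKETCH_MAX_BLOCK_SIZE) {
        return false;
    }
    /* Condition 3: a power of 2 has one bit set, so v & (v - 1) is 0. */
    return (value & (value - 1)) == 0;
}

/* Small usage example: a few values on either side of each condition. */
static void sketch_block_size_self_check(void)
{
    assert(sketch_block_size_ok(4096));
    assert(!sketch_block_size_ok(256));   /* below the assumed minimum */
    assert(!sketch_block_size_ok(3072));  /* in range, not a power of 2 */
}
```

Keeping the range check ahead of the power-of-2 test mirrors the order in the
patch, so an out-of-range value is rejected for being out of range first.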
New patch
1
From: Coiby Xu <coiby.xu@gmail.com>
1
2
3
By making use of libvhost-user, a block device drive can be shared with
4
the connected vhost-user client. Only one client can connect to the
5
server at a time.
6
7
Since vhost-user-server needs a block drive to be created first, delay
8
the creation of this object.
9
10
Suggested-by: Kevin Wolf <kwolf@redhat.com>
11
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
12
Signed-off-by: Coiby Xu <coiby.xu@gmail.com>
13
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
14
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
15
Message-id: 20200918080912.321299-6-coiby.xu@gmail.com
16
[Shorten "vhost_user_blk_server" string to "vhost_user_blk" to avoid the
17
following compiler warning:
18
../block/export/vhost-user-blk-server.c:178:50: error: ‘%s’ directive output truncated writing 21 bytes into a region of size 20 [-Werror=format-truncation=]
19
and fix "Invalid size %ld ..." ssize_t format string arguments for
20
32-bit hosts.
21
--Stefan]
22
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
23
---
24
block/export/vhost-user-blk-server.h | 36 ++
25
block/export/vhost-user-blk-server.c | 661 +++++++++++++++++++++++++++
26
softmmu/vl.c | 4 +
27
block/meson.build | 1 +
28
4 files changed, 702 insertions(+)
29
create mode 100644 block/export/vhost-user-blk-server.h
30
create mode 100644 block/export/vhost-user-blk-server.c
31
32
diff --git a/block/export/vhost-user-blk-server.h b/block/export/vhost-user-blk-server.h
33
new file mode 100644
34
index XXXXXXX..XXXXXXX
35
--- /dev/null
36
+++ b/block/export/vhost-user-blk-server.h
37
@@ -XXX,XX +XXX,XX @@
38
+/*
39
+ * Sharing QEMU block devices via vhost-user protocol
40
+ *
41
+ * Copyright (c) Coiby Xu <coiby.xu@gmail.com>.
42
+ * Copyright (c) 2020 Red Hat, Inc.
43
+ *
44
+ * This work is licensed under the terms of the GNU GPL, version 2 or
45
+ * later. See the COPYING file in the top-level directory.
46
+ */
47
+
48
+#ifndef VHOST_USER_BLK_SERVER_H
49
+#define VHOST_USER_BLK_SERVER_H
50
+#include "util/vhost-user-server.h"
51
+
52
+typedef struct VuBlockDev VuBlockDev;
53
+#define TYPE_VHOST_USER_BLK_SERVER "vhost-user-blk-server"
54
+#define VHOST_USER_BLK_SERVER(obj) \
55
+ OBJECT_CHECK(VuBlockDev, obj, TYPE_VHOST_USER_BLK_SERVER)
56
+
57
+/* vhost user block device */
58
+struct VuBlockDev {
59
+ Object parent_obj;
60
+ char *node_name;
61
+ SocketAddress *addr;
62
+ AioContext *ctx;
63
+ VuServer vu_server;
64
+ bool running;
65
+ uint32_t blk_size;
66
+ BlockBackend *backend;
67
+ QIOChannelSocket *sioc;
68
+ QTAILQ_ENTRY(VuBlockDev) next;
69
+ struct virtio_blk_config blkcfg;
70
+ bool writable;
71
+};
72
+
73
+#endif /* VHOST_USER_BLK_SERVER_H */
74
diff --git a/block/export/vhost-user-blk-server.c b/block/export/vhost-user-blk-server.c
75
new file mode 100644
76
index XXXXXXX..XXXXXXX
77
--- /dev/null
78
+++ b/block/export/vhost-user-blk-server.c
79
@@ -XXX,XX +XXX,XX @@
80
+/*
81
+ * Sharing QEMU block devices via vhost-user protocol
82
+ *
83
+ * Parts of the code based on nbd/server.c.
84
+ *
85
+ * Copyright (c) Coiby Xu <coiby.xu@gmail.com>.
86
+ * Copyright (c) 2020 Red Hat, Inc.
87
+ *
88
+ * This work is licensed under the terms of the GNU GPL, version 2 or
89
+ * later. See the COPYING file in the top-level directory.
90
+ */
91
+#include "qemu/osdep.h"
92
+#include "block/block.h"
93
+#include "vhost-user-blk-server.h"
94
+#include "qapi/error.h"
95
+#include "qom/object_interfaces.h"
96
+#include "sysemu/block-backend.h"
97
+#include "util/block-helpers.h"
98
+
99
+enum {
100
+ VHOST_USER_BLK_MAX_QUEUES = 1,
101
+};
102
+struct virtio_blk_inhdr {
103
+ unsigned char status;
104
+};
105
+
106
+typedef struct VuBlockReq {
107
+ VuVirtqElement *elem;
108
+ int64_t sector_num;
109
+ size_t size;
110
+ struct virtio_blk_inhdr *in;
111
+ struct virtio_blk_outhdr out;
112
+ VuServer *server;
113
+ struct VuVirtq *vq;
114
+} VuBlockReq;
115
+
116
+static void vu_block_req_complete(VuBlockReq *req)
117
+{
118
+ VuDev *vu_dev = &req->server->vu_dev;
119
+
120
+ /* IO size with 1 extra status byte */
121
+ vu_queue_push(vu_dev, req->vq, req->elem, req->size + 1);
122
+ vu_queue_notify(vu_dev, req->vq);
123
+
124
+ if (req->elem) {
125
+ free(req->elem);
126
+ }
127
+
128
+ g_free(req);
129
+}
130
+
131
+static VuBlockDev *get_vu_block_device_by_server(VuServer *server)
132
+{
133
+ return container_of(server, VuBlockDev, vu_server);
134
+}
135
+
136
+static int coroutine_fn
137
+vu_block_discard_write_zeroes(VuBlockReq *req, struct iovec *iov,
138
+ uint32_t iovcnt, uint32_t type)
139
+{
140
+ struct virtio_blk_discard_write_zeroes desc;
141
+ ssize_t size = iov_to_buf(iov, iovcnt, 0, &desc, sizeof(desc));
142
+ if (unlikely(size != sizeof(desc))) {
143
+ error_report("Invalid size %zd, expect %zu", size, sizeof(desc));
144
+ return -EINVAL;
145
+ }
146
+
147
+ VuBlockDev *vdev_blk = get_vu_block_device_by_server(req->server);
148
+ uint64_t range[2] = { le64_to_cpu(desc.sector) << 9,
149
+ le32_to_cpu(desc.num_sectors) << 9 };
150
+ if (type == VIRTIO_BLK_T_DISCARD) {
151
+ if (blk_co_pdiscard(vdev_blk->backend, range[0], range[1]) == 0) {
152
+ return 0;
153
+ }
154
+ } else if (type == VIRTIO_BLK_T_WRITE_ZEROES) {
155
+ if (blk_co_pwrite_zeroes(vdev_blk->backend,
156
+ range[0], range[1], 0) == 0) {
157
+ return 0;
158
+ }
159
+ }
160
+
161
+ return -EINVAL;
162
+}
163
+
164
+static void coroutine_fn vu_block_flush(VuBlockReq *req)
165
+{
166
+ VuBlockDev *vdev_blk = get_vu_block_device_by_server(req->server);
167
+ BlockBackend *backend = vdev_blk->backend;
168
+ blk_co_flush(backend);
169
+}
170
+
171
+struct req_data {
172
+ VuServer *server;
173
+ VuVirtq *vq;
174
+ VuVirtqElement *elem;
175
+};
176
+
177
+static void coroutine_fn vu_block_virtio_process_req(void *opaque)
178
+{
179
+ struct req_data *data = opaque;
180
+ VuServer *server = data->server;
181
+ VuVirtq *vq = data->vq;
182
+ VuVirtqElement *elem = data->elem;
183
+ uint32_t type;
184
+ VuBlockReq *req;
185
+
186
+ VuBlockDev *vdev_blk = get_vu_block_device_by_server(server);
187
+ BlockBackend *backend = vdev_blk->backend;
188
+
189
+ struct iovec *in_iov = elem->in_sg;
190
+ struct iovec *out_iov = elem->out_sg;
191
+ unsigned in_num = elem->in_num;
192
+ unsigned out_num = elem->out_num;
193
+ /* refer to hw/block/virtio-blk.c */
194
+ if (elem->out_num < 1 || elem->in_num < 1) {
195
+ error_report("virtio-blk request missing headers");
196
+ free(elem);
197
+ return;
198
+ }
199
+
200
+ req = g_new0(VuBlockReq, 1);
201
+ req->server = server;
202
+ req->vq = vq;
203
+ req->elem = elem;
204
+
205
+ if (unlikely(iov_to_buf(out_iov, out_num, 0, &req->out,
206
+ sizeof(req->out)) != sizeof(req->out))) {
207
+ error_report("virtio-blk request outhdr too short");
208
+ goto err;
209
+ }
210
+
211
+ iov_discard_front(&out_iov, &out_num, sizeof(req->out));
212
+
213
+ if (in_iov[in_num - 1].iov_len < sizeof(struct virtio_blk_inhdr)) {
214
+ error_report("virtio-blk request inhdr too short");
215
+ goto err;
216
+ }
217
+
218
+ /* We always touch the last byte, so just see how big in_iov is. */
219
+ req->in = (void *)in_iov[in_num - 1].iov_base
220
+ + in_iov[in_num - 1].iov_len
221
+ - sizeof(struct virtio_blk_inhdr);
222
+ iov_discard_back(in_iov, &in_num, sizeof(struct virtio_blk_inhdr));
223
+
224
+ type = le32_to_cpu(req->out.type);
225
+ switch (type & ~VIRTIO_BLK_T_BARRIER) {
226
+ case VIRTIO_BLK_T_IN:
227
+ case VIRTIO_BLK_T_OUT: {
228
+ ssize_t ret = 0;
229
+ bool is_write = type & VIRTIO_BLK_T_OUT;
230
+ req->sector_num = le64_to_cpu(req->out.sector);
231
+
232
+ int64_t offset = req->sector_num * vdev_blk->blk_size;
233
+ QEMUIOVector qiov;
234
+ if (is_write) {
235
+ qemu_iovec_init_external(&qiov, out_iov, out_num);
236
+ ret = blk_co_pwritev(backend, offset, qiov.size,
237
+ &qiov, 0);
238
+ } else {
239
+ qemu_iovec_init_external(&qiov, in_iov, in_num);
240
+ ret = blk_co_preadv(backend, offset, qiov.size,
241
+ &qiov, 0);
242
+ }
243
+ if (ret >= 0) {
244
+ req->in->status = VIRTIO_BLK_S_OK;
245
+ } else {
246
+ req->in->status = VIRTIO_BLK_S_IOERR;
247
+ }
248
+ break;
249
+ }
250
+ case VIRTIO_BLK_T_FLUSH:
251
+ vu_block_flush(req);
252
+ req->in->status = VIRTIO_BLK_S_OK;
253
+ break;
254
+ case VIRTIO_BLK_T_GET_ID: {
255
+ size_t size = MIN(iov_size(&elem->in_sg[0], in_num),
256
+ VIRTIO_BLK_ID_BYTES);
257
+ snprintf(elem->in_sg[0].iov_base, size, "%s", "vhost_user_blk");
258
+ req->in->status = VIRTIO_BLK_S_OK;
259
+ req->size = elem->in_sg[0].iov_len;
260
+ break;
261
+ }
262
+ case VIRTIO_BLK_T_DISCARD:
263
+ case VIRTIO_BLK_T_WRITE_ZEROES: {
264
+ int rc;
265
+ rc = vu_block_discard_write_zeroes(req, &elem->out_sg[1],
266
+ out_num, type);
267
+ if (rc == 0) {
268
+ req->in->status = VIRTIO_BLK_S_OK;
269
+ } else {
270
+ req->in->status = VIRTIO_BLK_S_IOERR;
271
+ }
272
+ break;
273
+ }
274
+ default:
275
+ req->in->status = VIRTIO_BLK_S_UNSUPP;
276
+ break;
277
+ }
278
+
279
+ vu_block_req_complete(req);
280
+ return;
281
+
282
+err:
283
+ free(elem);
284
+ g_free(req);
285
+ return;
286
+}
287
+
288
+static void vu_block_process_vq(VuDev *vu_dev, int idx)
289
+{
290
+ VuServer *server;
291
+ VuVirtq *vq;
292
+ struct req_data *req_data;
293
+
294
+ server = container_of(vu_dev, VuServer, vu_dev);
295
+ assert(server);
296
+
297
+ vq = vu_get_queue(vu_dev, idx);
298
+ assert(vq);
299
+ VuVirtqElement *elem;
300
+ while (1) {
301
+ elem = vu_queue_pop(vu_dev, vq, sizeof(VuVirtqElement) +
302
+ sizeof(VuBlockReq));
303
+ if (elem) {
304
+ req_data = g_new0(struct req_data, 1);
305
+ req_data->server = server;
306
+ req_data->vq = vq;
307
+ req_data->elem = elem;
308
+ Coroutine *co = qemu_coroutine_create(vu_block_virtio_process_req,
309
+ req_data);
310
+ aio_co_enter(server->ioc->ctx, co);
311
+ } else {
312
+ break;
313
+ }
314
+ }
315
+}
316
+
317
+static void vu_block_queue_set_started(VuDev *vu_dev, int idx, bool started)
318
+{
319
+ VuVirtq *vq;
320
+
321
+ assert(vu_dev);
322
+
323
+ vq = vu_get_queue(vu_dev, idx);
324
+ vu_set_queue_handler(vu_dev, vq, started ? vu_block_process_vq : NULL);
325
+}
326
+
327
+static uint64_t vu_block_get_features(VuDev *dev)
328
+{
329
+ uint64_t features;
330
+ VuServer *server = container_of(dev, VuServer, vu_dev);
331
+ VuBlockDev *vdev_blk = get_vu_block_device_by_server(server);
332
+ features = 1ull << VIRTIO_BLK_F_SIZE_MAX |
333
+ 1ull << VIRTIO_BLK_F_SEG_MAX |
334
+ 1ull << VIRTIO_BLK_F_TOPOLOGY |
335
+ 1ull << VIRTIO_BLK_F_BLK_SIZE |
336
+ 1ull << VIRTIO_BLK_F_FLUSH |
337
+ 1ull << VIRTIO_BLK_F_DISCARD |
338
+ 1ull << VIRTIO_BLK_F_WRITE_ZEROES |
339
+ 1ull << VIRTIO_BLK_F_CONFIG_WCE |
340
+ 1ull << VIRTIO_F_VERSION_1 |
341
+ 1ull << VIRTIO_RING_F_INDIRECT_DESC |
342
+ 1ull << VIRTIO_RING_F_EVENT_IDX |
343
+ 1ull << VHOST_USER_F_PROTOCOL_FEATURES;
344
+
345
+ if (!vdev_blk->writable) {
346
+ features |= 1ull << VIRTIO_BLK_F_RO;
347
+ }
348
+
349
+ return features;
350
+}
351
+
352
+static uint64_t vu_block_get_protocol_features(VuDev *dev)
353
+{
354
+ return 1ull << VHOST_USER_PROTOCOL_F_CONFIG |
355
+ 1ull << VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD;
356
+}
357
+
358
+static int
359
+vu_block_get_config(VuDev *vu_dev, uint8_t *config, uint32_t len)
360
+{
361
+ VuServer *server = container_of(vu_dev, VuServer, vu_dev);
362
+ VuBlockDev *vdev_blk = get_vu_block_device_by_server(server);
363
+ memcpy(config, &vdev_blk->blkcfg, len);
364
+
365
+ return 0;
366
+}
367
+
368
+static int
369
+vu_block_set_config(VuDev *vu_dev, const uint8_t *data,
370
+ uint32_t offset, uint32_t size, uint32_t flags)
371
+{
372
+ VuServer *server = container_of(vu_dev, VuServer, vu_dev);
373
+ VuBlockDev *vdev_blk = get_vu_block_device_by_server(server);
374
+ uint8_t wce;
375
+
376
+ /* don't support live migration */
377
+ if (flags != VHOST_SET_CONFIG_TYPE_MASTER) {
378
+ return -EINVAL;
379
+ }
380
+
381
+ if (offset != offsetof(struct virtio_blk_config, wce) ||
382
+ size != 1) {
383
+ return -EINVAL;
384
+ }
385
+
386
+ wce = *data;
387
+ vdev_blk->blkcfg.wce = wce;
388
+ blk_set_enable_write_cache(vdev_blk->backend, wce);
389
+ return 0;
390
+}
391
+
392
+/*
393
+ * When the client disconnects, it sends a VHOST_USER_NONE request
394
+ * and vu_process_message will simply call exit, which causes the VM
395
+ * to exit abruptly.
396
+ * To avoid this issue, process VHOST_USER_NONE request ahead
397
+ * of vu_process_message.
398
+ *
399
+ */
400
+static int vu_block_process_msg(VuDev *dev, VhostUserMsg *vmsg, int *do_reply)
401
+{
402
+ if (vmsg->request == VHOST_USER_NONE) {
403
+ dev->panic(dev, "disconnect");
404
+ return true;
405
+ }
406
+ return false;
407
+}
408
+
409
+static const VuDevIface vu_block_iface = {
410
+ .get_features = vu_block_get_features,
411
+ .queue_set_started = vu_block_queue_set_started,
412
+ .get_protocol_features = vu_block_get_protocol_features,
413
+ .get_config = vu_block_get_config,
414
+ .set_config = vu_block_set_config,
415
+ .process_msg = vu_block_process_msg,
416
+};
417
+
418
+static void blk_aio_attached(AioContext *ctx, void *opaque)
419
+{
420
+ VuBlockDev *vub_dev = opaque;
421
+ aio_context_acquire(ctx);
422
+ vhost_user_server_set_aio_context(&vub_dev->vu_server, ctx);
423
+ aio_context_release(ctx);
424
+}
425
+
426
+static void blk_aio_detach(void *opaque)
427
+{
428
+ VuBlockDev *vub_dev = opaque;
429
+ AioContext *ctx = vub_dev->vu_server.ctx;
430
+ aio_context_acquire(ctx);
431
+ vhost_user_server_set_aio_context(&vub_dev->vu_server, NULL);
432
+ aio_context_release(ctx);
433
+}
434
+
435
+static void
436
+vu_block_initialize_config(BlockDriverState *bs,
437
+ struct virtio_blk_config *config, uint32_t blk_size)
438
+{
439
+ config->capacity = bdrv_getlength(bs) >> BDRV_SECTOR_BITS;
440
+ config->blk_size = blk_size;
441
+ config->size_max = 0;
442
+ config->seg_max = 128 - 2;
443
+ config->min_io_size = 1;
444
+ config->opt_io_size = 1;
445
+ config->num_queues = VHOST_USER_BLK_MAX_QUEUES;
446
+ config->max_discard_sectors = 32768;
447
+ config->max_discard_seg = 1;
448
+ config->discard_sector_alignment = config->blk_size >> 9;
449
+ config->max_write_zeroes_sectors = 32768;
450
+ config->max_write_zeroes_seg = 1;
451
+}
452
+
453
+static VuBlockDev *vu_block_init(VuBlockDev *vu_block_device, Error **errp)
454
+{
455
+
456
+ BlockBackend *blk;
457
+ Error *local_error = NULL;
458
+ const char *node_name = vu_block_device->node_name;
459
+ bool writable = vu_block_device->writable;
460
+ uint64_t perm = BLK_PERM_CONSISTENT_READ;
461
+ int ret;
462
+
463
+ AioContext *ctx;
464
+
465
+ BlockDriverState *bs = bdrv_lookup_bs(node_name, node_name, &local_error);
466
+
467
+ if (!bs) {
468
+ error_propagate(errp, local_error);
469
+ return NULL;
470
+ }
471
+
472
+ if (bdrv_is_read_only(bs)) {
473
+ writable = false;
474
+ }
475
+
476
+ if (writable) {
477
+ perm |= BLK_PERM_WRITE;
478
+ }
479
+
480
+ ctx = bdrv_get_aio_context(bs);
481
+ aio_context_acquire(ctx);
482
+ bdrv_invalidate_cache(bs, NULL);
483
+ aio_context_release(ctx);
484
+
485
+ /*
486
+ * Don't allow resize while the vhost user server is running,
487
+ * otherwise we don't care what happens with the node.
488
+ */
489
+ blk = blk_new(bdrv_get_aio_context(bs), perm,
490
+ BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE_UNCHANGED |
491
+ BLK_PERM_WRITE | BLK_PERM_GRAPH_MOD);
492
+ ret = blk_insert_bs(blk, bs, errp);
493
+
494
+ if (ret < 0) {
495
+ goto fail;
496
+ }
497
+
498
+ blk_set_enable_write_cache(blk, false);
499
+
500
+ blk_set_allow_aio_context_change(blk, true);
501
+
502
+ vu_block_device->blkcfg.wce = 0;
503
+ vu_block_device->backend = blk;
504
+ if (!vu_block_device->blk_size) {
505
+ vu_block_device->blk_size = BDRV_SECTOR_SIZE;
506
+ }
507
+ vu_block_device->blkcfg.blk_size = vu_block_device->blk_size;
508
+ blk_set_guest_block_size(blk, vu_block_device->blk_size);
509
+ vu_block_initialize_config(bs, &vu_block_device->blkcfg,
510
+ vu_block_device->blk_size);
511
+ return vu_block_device;
512
+
513
+fail:
514
+ blk_unref(blk);
515
+ return NULL;
516
+}
517
+
518
+static void vu_block_deinit(VuBlockDev *vu_block_device)
519
+{
520
+ if (vu_block_device->backend) {
521
+ blk_remove_aio_context_notifier(vu_block_device->backend, blk_aio_attached,
522
+ blk_aio_detach, vu_block_device);
523
+ }
524
+
525
+ blk_unref(vu_block_device->backend);
526
+}
527
+
528
+static void vhost_user_blk_server_stop(VuBlockDev *vu_block_device)
529
+{
530
+ vhost_user_server_stop(&vu_block_device->vu_server);
531
+ vu_block_deinit(vu_block_device);
532
+}
533
+
534
+static void vhost_user_blk_server_start(VuBlockDev *vu_block_device,
535
+ Error **errp)
536
+{
537
+ AioContext *ctx;
538
+ SocketAddress *addr = vu_block_device->addr;
539
+
540
+ if (!vu_block_init(vu_block_device, errp)) {
541
+ return;
542
+ }
543
+
544
+ ctx = bdrv_get_aio_context(blk_bs(vu_block_device->backend));
545
+
546
+ if (!vhost_user_server_start(&vu_block_device->vu_server, addr, ctx,
547
+ VHOST_USER_BLK_MAX_QUEUES,
548
+ NULL, &vu_block_iface,
549
+ errp)) {
550
+ goto error;
551
+ }
552
+
553
+ blk_add_aio_context_notifier(vu_block_device->backend, blk_aio_attached,
554
+ blk_aio_detach, vu_block_device);
555
+ vu_block_device->running = true;
556
+ return;
557
+
558
+ error:
559
+ vu_block_deinit(vu_block_device);
560
+}
561
+
562
+static bool vu_prop_modifiable(VuBlockDev *vus, Error **errp)
563
+{
564
+ if (vus->running) {
565
+ error_setg(errp, "The property can't be modified "
566
+ "while the server is running");
567
+ return false;
568
+ }
569
+ return true;
570
+}
571
+
572
+static void vu_set_node_name(Object *obj, const char *value, Error **errp)
573
+{
574
+ VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
575
+
576
+ if (!vu_prop_modifiable(vus, errp)) {
577
+ return;
578
+ }
579
+
580
+ if (vus->node_name) {
581
+ g_free(vus->node_name);
582
+ }
583
+
584
+ vus->node_name = g_strdup(value);
585
+}
586
+
587
+static char *vu_get_node_name(Object *obj, Error **errp)
588
+{
589
+ VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
590
+ return g_strdup(vus->node_name);
591
+}
592
+
593
+static void free_socket_addr(SocketAddress *addr)
594
+{
595
+ g_free(addr->u.q_unix.path);
596
+ g_free(addr);
597
+}
598
+
599
+static void vu_set_unix_socket(Object *obj, const char *value,
600
+ Error **errp)
601
+{
602
+ VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
603
+
604
+ if (!vu_prop_modifiable(vus, errp)) {
605
+ return;
606
+ }
607
+
608
+ if (vus->addr) {
609
+ free_socket_addr(vus->addr);
610
+ }
611
+
612
+ SocketAddress *addr = g_new0(SocketAddress, 1);
613
+ addr->type = SOCKET_ADDRESS_TYPE_UNIX;
614
+ addr->u.q_unix.path = g_strdup(value);
615
+ vus->addr = addr;
616
+}
617
+
618
+static char *vu_get_unix_socket(Object *obj, Error **errp)
619
+{
620
+ VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
621
+ return g_strdup(vus->addr->u.q_unix.path);
622
+}
623
+
624
+static bool vu_get_block_writable(Object *obj, Error **errp)
625
+{
626
+ VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
627
+ return vus->writable;
628
+}
629
+
630
+static void vu_set_block_writable(Object *obj, bool value, Error **errp)
631
+{
632
+ VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
633
+
634
+ if (!vu_prop_modifiable(vus, errp)) {
635
+ return;
636
+ }
637
+
638
+ vus->writable = value;
639
+}
640
+
641
+static void vu_get_blk_size(Object *obj, Visitor *v, const char *name,
642
+ void *opaque, Error **errp)
643
+{
644
+ VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
645
+ uint32_t value = vus->blk_size;
646
+
647
+ visit_type_uint32(v, name, &value, errp);
648
+}
649
+
650
+static void vu_set_blk_size(Object *obj, Visitor *v, const char *name,
651
+ void *opaque, Error **errp)
652
+{
653
+ VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
654
+
655
+ Error *local_err = NULL;
656
+ uint32_t value;
657
+
658
+ if (!vu_prop_modifiable(vus, errp)) {
659
+ return;
660
+ }
661
+
662
+ visit_type_uint32(v, name, &value, &local_err);
663
+ if (local_err) {
664
+ goto out;
665
+ }
666
+
667
+ check_block_size(object_get_typename(obj), name, value, &local_err);
668
+ if (local_err) {
669
+ goto out;
670
+ }
671
+
672
+ vus->blk_size = value;
673
+
674
+out:
675
+ error_propagate(errp, local_err);
676
+}
677
+
678
+static void vhost_user_blk_server_instance_finalize(Object *obj)
679
+{
680
+ VuBlockDev *vub = VHOST_USER_BLK_SERVER(obj);
681
+
682
+ vhost_user_blk_server_stop(vub);
683
+
684
+ /*
685
+ * Unlike object_property_add_str, object_class_property_add_str
686
+ * doesn't have a release method. Thus manual memory freeing is
687
+ * needed.
688
+ */
689
+ free_socket_addr(vub->addr);
690
+ g_free(vub->node_name);
691
+}
692
+
693
+static void vhost_user_blk_server_complete(UserCreatable *obj, Error **errp)
694
+{
695
+ VuBlockDev *vub = VHOST_USER_BLK_SERVER(obj);
696
+
697
+ vhost_user_blk_server_start(vub, errp);
698
+}
699
+
700
+static void vhost_user_blk_server_class_init(ObjectClass *klass,
701
+ void *class_data)
702
+{
703
+ UserCreatableClass *ucc = USER_CREATABLE_CLASS(klass);
704
+ ucc->complete = vhost_user_blk_server_complete;
705
+
706
+ object_class_property_add_bool(klass, "writable",
707
+ vu_get_block_writable,
708
+ vu_set_block_writable);
709
+
710
+ object_class_property_add_str(klass, "node-name",
711
+ vu_get_node_name,
712
+ vu_set_node_name);
713
+
714
+ object_class_property_add_str(klass, "unix-socket",
715
+ vu_get_unix_socket,
716
+ vu_set_unix_socket);
717
+
718
+ object_class_property_add(klass, "logical-block-size", "uint32",
719
+ vu_get_blk_size, vu_set_blk_size,
720
+ NULL, NULL);
721
+}
722
+
723
+static const TypeInfo vhost_user_blk_server_info = {
724
+ .name = TYPE_VHOST_USER_BLK_SERVER,
725
+ .parent = TYPE_OBJECT,
726
+ .instance_size = sizeof(VuBlockDev),
727
+ .instance_finalize = vhost_user_blk_server_instance_finalize,
728
+ .class_init = vhost_user_blk_server_class_init,
729
+ .interfaces = (InterfaceInfo[]) {
730
+ {TYPE_USER_CREATABLE},
731
+ {}
732
+ },
733
+};
734
+
735
+static void vhost_user_blk_server_register_types(void)
736
+{
737
+ type_register_static(&vhost_user_blk_server_info);
738
+}
739
+
740
+type_init(vhost_user_blk_server_register_types)
741
diff --git a/softmmu/vl.c b/softmmu/vl.c
742
index XXXXXXX..XXXXXXX 100644
743
--- a/softmmu/vl.c
744
+++ b/softmmu/vl.c
745
@@ -XXX,XX +XXX,XX @@ static bool object_create_initial(const char *type, QemuOpts *opts)
746
}
747
#endif
748
749
+ /* Reason: vhost-user-blk-server property "node-name" */
750
+ if (g_str_equal(type, "vhost-user-blk-server")) {
751
+ return false;
752
+ }
753
/*
754
* Reason: filter-* property "netdev" etc.
755
*/
756
diff --git a/block/meson.build b/block/meson.build
757
index XXXXXXX..XXXXXXX 100644
758
--- a/block/meson.build
759
+++ b/block/meson.build
760
@@ -XXX,XX +XXX,XX @@ block_ss.add(when: 'CONFIG_WIN32', if_true: files('file-win32.c', 'win32-aio.c')
761
block_ss.add(when: 'CONFIG_POSIX', if_true: [files('file-posix.c'), coref, iokit])
762
block_ss.add(when: 'CONFIG_LIBISCSI', if_true: files('iscsi-opts.c'))
763
block_ss.add(when: 'CONFIG_LINUX', if_true: files('nvme.c'))
764
+block_ss.add(when: 'CONFIG_LINUX', if_true: files('export/vhost-user-blk-server.c', '../contrib/libvhost-user/libvhost-user.c'))
765
block_ss.add(when: 'CONFIG_REPLICATION', if_true: files('replication.c'))
766
block_ss.add(when: 'CONFIG_SHEEPDOG', if_true: files('sheepdog.c'))
767
block_ss.add(when: ['CONFIG_LINUX_AIO', libaio], if_true: files('linux-aio.c'))
768
--
769
2.26.2
770
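Note on vu_block_discard_write_zeroes() in the patch above: the request
payload carries a little-endian 64-bit start sector and a 32-bit sector count,
and both are converted to bytes with a shift by 9. The sketch below shows that
conversion on a raw 16-byte buffer, assuming the standard virtio-blk layout
(le64 sector, le32 num_sectors, le32 flags). The names sketch_le_load,
sketch_byte_range and sketch_discard_range are hypothetical and not part of
QEMU. Unlike the patch, the sketch widens the sector count to 64 bits before
shifting, so a large count cannot wrap in 32-bit arithmetic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decode an unsigned little-endian value bytewise. */
static uint64_t sketch_le_load(const uint8_t *p, unsigned nbytes)
{
    uint64_t v = 0;
    for (unsigned i = 0; i < nbytes; i++) {
        v |= (uint64_t)p[i] << (8 * i);
    }
    return v;
}

/* Byte offset and length of a discard or write-zeroes request. */
struct sketch_byte_range {
    uint64_t offset;
    uint64_t bytes;
};

/*
 * desc is the 16-byte wire payload: le64 sector, le32 num_sectors,
 * le32 flags. Sectors are 512 bytes, hence the shift by 9.
 */
static struct sketch_byte_range sketch_discard_range(const uint8_t *desc)
{
    struct sketch_byte_range r;

    assert(desc != NULL);
    r.offset = sketch_le_load(desc, 8) << 9;
    /* The count is already a uint64_t here, so the shift cannot wrap. */
    r.bytes = sketch_le_load(desc + 8, 4) << 9;
    return r;
}
```

With the limits advertised by vu_block_initialize_config() (32768 sectors per
request) a conforming client stays far below the point where a 32-bit shift
would wrap, so the widening only matters for out-of-spec requests.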
New patch
1
From: Coiby Xu <coiby.xu@gmail.com>
1
2
3
Suggested-by: Stefano Garzarella <sgarzare@redhat.com>
4
Signed-off-by: Coiby Xu <coiby.xu@gmail.com>
5
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
6
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
7
Message-id: 20200918080912.321299-8-coiby.xu@gmail.com
8
[Removed reference to vhost-user-blk-test.c, it will be sent in a
9
separate pull request.
10
--Stefan]
11
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
12
---
13
MAINTAINERS | 7 +++++++
14
1 file changed, 7 insertions(+)
15
16
diff --git a/MAINTAINERS b/MAINTAINERS
17
index XXXXXXX..XXXXXXX 100644
18
--- a/MAINTAINERS
19
+++ b/MAINTAINERS
20
@@ -XXX,XX +XXX,XX @@ L: qemu-block@nongnu.org
21
S: Supported
22
F: tests/image-fuzzer/
23
24
+Vhost-user block device backend server
25
+M: Coiby Xu <Coiby.Xu@gmail.com>
26
+S: Maintained
27
+F: block/export/vhost-user-blk-server.c
28
+F: util/vhost-user-server.c
29
+F: tests/qtest/libqos/vhost-user-blk.c
30
+
31
Replication
32
M: Wen Congyang <wencongyang2@huawei.com>
33
M: Xie Changlong <xiechanglong.d@gmail.com>
34
--
35
2.26.2
36
New patch
1
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2
Message-id: 20200924151549.913737-3-stefanha@redhat.com
3
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
4
---
5
util/vhost-user-server.c | 2 +-
6
1 file changed, 1 insertion(+), 1 deletion(-)
1
7
8
diff --git a/util/vhost-user-server.c b/util/vhost-user-server.c
9
index XXXXXXX..XXXXXXX 100644
10
--- a/util/vhost-user-server.c
11
+++ b/util/vhost-user-server.c
12
@@ -XXX,XX +XXX,XX @@ bool vhost_user_server_start(VuServer *server,
13
return false;
14
}
15
16
- /* zero out unspecified fileds */
17
+ /* zero out unspecified fields */
18
*server = (VuServer) {
19
.listener = listener,
20
.vu_iface = vu_iface,
21
--
22
2.26.2
23
New patch
1
We already have access to the value with the correct type (ioc and sioc
2
are the same QIOChannel).
1
3
4
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
5
Message-id: 20200924151549.913737-4-stefanha@redhat.com
6
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
7
---
8
util/vhost-user-server.c | 2 +-
9
1 file changed, 1 insertion(+), 1 deletion(-)
10
11
diff --git a/util/vhost-user-server.c b/util/vhost-user-server.c
12
index XXXXXXX..XXXXXXX 100644
13
--- a/util/vhost-user-server.c
14
+++ b/util/vhost-user-server.c
15
@@ -XXX,XX +XXX,XX @@ static void vu_accept(QIONetListener *listener, QIOChannelSocket *sioc,
16
server->ioc = QIO_CHANNEL(sioc);
17
object_ref(OBJECT(server->ioc));
18
qio_channel_attach_aio_context(server->ioc, server->ctx);
19
- qio_channel_set_blocking(QIO_CHANNEL(server->sioc), false, NULL);
20
+ qio_channel_set_blocking(server->ioc, false, NULL);
21
vu_client_start(server);
22
}
23
24
--
25
2.26.2
26
1
Instead of checking bs->wps or bs->bl.zone_size for whether zone
1
Explicitly deleting watches is not necessary since libvhost-user calls
2
information is present, check bs->bl.zoned. That is the flag that
2
remove_watch() during vu_deinit(). Add an assertion to check this
3
raw_refresh_zoned_limits() reliably sets to indicate zone support. If
3
though.
4
it is set to something other than BLK_Z_NONE, other values and objects
5
like bs->wps and bs->bl.zone_size must be non-null/zero and valid; if it
6
is not, we cannot rely on their validity.
7
4
8
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
5
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
9
Message-Id: <20230824155345.109765-3-hreitz@redhat.com>
6
Message-id: 20200924151549.913737-5-stefanha@redhat.com
10
Reviewed-by: Sam Li <faithilikerun@gmail.com>
7
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
11
---
8
---
12
block/file-posix.c | 12 +++++++-----
9
util/vhost-user-server.c | 19 ++++---------------
13
1 file changed, 7 insertions(+), 5 deletions(-)
10
1 file changed, 4 insertions(+), 15 deletions(-)
14
11
15
diff --git a/block/file-posix.c b/block/file-posix.c
12
diff --git a/util/vhost-user-server.c b/util/vhost-user-server.c
16
index XXXXXXX..XXXXXXX 100644
13
index XXXXXXX..XXXXXXX 100644
17
--- a/block/file-posix.c
14
--- a/util/vhost-user-server.c
18
+++ b/block/file-posix.c
15
+++ b/util/vhost-user-server.c
19
@@ -XXX,XX +XXX,XX @@ static int coroutine_fn raw_co_prw(BlockDriverState *bs, uint64_t offset,
16
@@ -XXX,XX +XXX,XX @@ static void close_client(VuServer *server)
20
if (fd_open(bs) < 0)
17
/* When this is set vu_client_trip will stop new processing vhost-user message */
21
return -EIO;
18
server->sioc = NULL;
22
#if defined(CONFIG_BLKZONED)
19
23
- if ((type & (QEMU_AIO_WRITE | QEMU_AIO_ZONE_APPEND)) && bs->wps) {
20
- VuFdWatch *vu_fd_watch, *next;
24
+ if ((type & (QEMU_AIO_WRITE | QEMU_AIO_ZONE_APPEND)) &&
21
- QTAILQ_FOREACH_SAFE(vu_fd_watch, &server->vu_fd_watches, next, next) {
25
+ bs->bl.zoned != BLK_Z_NONE) {
22
- aio_set_fd_handler(server->ioc->ctx, vu_fd_watch->fd, true, NULL,
26
qemu_co_mutex_lock(&bs->wps->colock);
23
- NULL, NULL, NULL);
27
- if (type & QEMU_AIO_ZONE_APPEND && bs->bl.zone_size) {
24
- }
28
+ if (type & QEMU_AIO_ZONE_APPEND) {
25
-
29
int index = offset / bs->bl.zone_size;
26
- while (!QTAILQ_EMPTY(&server->vu_fd_watches)) {
30
offset = bs->wps->wp[index];
27
- QTAILQ_FOREACH_SAFE(vu_fd_watch, &server->vu_fd_watches, next, next) {
31
}
28
- if (!vu_fd_watch->processing) {
32
@@ -XXX,XX +XXX,XX @@ out:
29
- QTAILQ_REMOVE(&server->vu_fd_watches, vu_fd_watch, next);
33
{
30
- g_free(vu_fd_watch);
34
BlockZoneWps *wps = bs->wps;
31
- }
35
if (ret == 0) {
32
- }
36
- if ((type & (QEMU_AIO_WRITE | QEMU_AIO_ZONE_APPEND))
33
- }
37
- && wps && bs->bl.zone_size) {
34
-
38
+ if ((type & (QEMU_AIO_WRITE | QEMU_AIO_ZONE_APPEND)) &&
35
while (server->processing_msg) {
39
+ bs->bl.zoned != BLK_Z_NONE) {
36
if (server->ioc->read_coroutine) {
40
uint64_t *wp = &wps->wp[offset / bs->bl.zone_size];
37
server->ioc->read_coroutine = NULL;
41
if (!BDRV_ZT_IS_CONV(*wp)) {
38
@@ -XXX,XX +XXX,XX @@ static void close_client(VuServer *server)
42
if (type & QEMU_AIO_ZONE_APPEND) {
43
@@ -XXX,XX +XXX,XX @@ out:
44
}
45
}
39
}
46
40
47
- if ((type & (QEMU_AIO_WRITE | QEMU_AIO_ZONE_APPEND)) && wps) {
41
vu_deinit(&server->vu_dev);
48
+ if ((type & (QEMU_AIO_WRITE | QEMU_AIO_ZONE_APPEND)) &&
42
+
49
+ bs->bl.zoned != BLK_Z_NONE) {
43
+ /* vu_deinit() should have called remove_watch() */
50
qemu_co_mutex_unlock(&wps->colock);
44
+ assert(QTAILQ_EMPTY(&server->vu_fd_watches));
51
}
45
+
46
object_unref(OBJECT(sioc));
47
object_unref(OBJECT(server->ioc));
52
}
48
}
53
--
49
--
54
2.41.0
50
2.26.2
51
1
From: zhenwei pi <pizhenwei@bytedance.com>
1
Only one struct is needed per request. Drop req_data and the separate
2
VuBlockReq instance. Instead let vu_queue_pop() allocate everything at
3
once.
2
4
3
'bool is_write' style is obsolete from throttle framework, adapt
5
This fixes the req_data memory leak in vu_block_virtio_process_req().
4
fsdev to the new style.
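
The "one struct per request" idea in the vhost-user-blk message above can be sketched in isolation. This is only an illustration under stated assumptions: Elem, Req, pop_elem() and req_new() are hypothetical stand-ins for VuVirtqElement, VuBlockReq and vu_queue_pop(), and the sketch assumes the pop function allocates the requested size with the element at the start of the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for VuVirtqElement. */
typedef struct {
    unsigned int index;
} Elem;

/* Hypothetical stand-in for VuBlockReq: the element is embedded, not pointed to. */
typedef struct {
    Elem elem;        /* must remain the first member */
    long sector_num;
    size_t size;
} Req;

/*
 * Hypothetical stand-in for a pop function that allocates sz bytes and
 * fills in the element that sits at the start of the buffer.
 */
static void *pop_elem(size_t sz, unsigned int index)
{
    Elem *elem;

    assert(sz >= sizeof(Elem));
    elem = calloc(1, sz);
    if (elem) {
        elem->index = index;
    }
    return elem;
}

/* One allocation yields the whole request, so a single free() releases it. */
static Req *req_new(unsigned int index)
{
    return pop_elem(sizeof(Req), index);
}
```

Because the element and the request share one allocation, there is no second object that an error path could forget to free.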
5
6
6
Cc: Greg Kurz <groug@kaod.org>
7
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
7
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
8
Message-id: 20200924151549.913737-6-stefanha@redhat.com
8
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
9
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
9
Message-Id: <20230728022006.1098509-9-pizhenwei@bytedance.com>
10
Reviewed-by: Greg Kurz <groug@kaod.org>
11
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
12
---
10
---
13
fsdev/qemu-fsdev-throttle.h | 4 ++--
11
block/export/vhost-user-blk-server.c | 68 +++++++++-------------------
14
fsdev/qemu-fsdev-throttle.c | 14 +++++++-------
12
1 file changed, 21 insertions(+), 47 deletions(-)
15
hw/9pfs/cofile.c | 4 ++--
16
3 files changed, 11 insertions(+), 11 deletions(-)
17
13
18
diff --git a/fsdev/qemu-fsdev-throttle.h b/fsdev/qemu-fsdev-throttle.h
14
diff --git a/block/export/vhost-user-blk-server.c b/block/export/vhost-user-blk-server.c
19
index XXXXXXX..XXXXXXX 100644
15
index XXXXXXX..XXXXXXX 100644
20
--- a/fsdev/qemu-fsdev-throttle.h
16
--- a/block/export/vhost-user-blk-server.c
21
+++ b/fsdev/qemu-fsdev-throttle.h
17
+++ b/block/export/vhost-user-blk-server.c
22
@@ -XXX,XX +XXX,XX @@ typedef struct FsThrottle {
18
@@ -XXX,XX +XXX,XX @@ struct virtio_blk_inhdr {
23
ThrottleState ts;
19
};
24
ThrottleTimers tt;
20
25
ThrottleConfig cfg;
21
typedef struct VuBlockReq {
26
- CoQueue throttled_reqs[2];
22
- VuVirtqElement *elem;
27
+ CoQueue throttled_reqs[THROTTLE_MAX];
23
+ VuVirtqElement elem;
28
} FsThrottle;
24
int64_t sector_num;
29
25
size_t size;
30
int fsdev_throttle_parse_opts(QemuOpts *, FsThrottle *, Error **);
26
struct virtio_blk_inhdr *in;
31
27
@@ -XXX,XX +XXX,XX @@ static void vu_block_req_complete(VuBlockReq *req)
32
void fsdev_throttle_init(FsThrottle *);
28
VuDev *vu_dev = &req->server->vu_dev;
33
29
34
-void coroutine_fn fsdev_co_throttle_request(FsThrottle *, bool ,
30
/* IO size with 1 extra status byte */
35
+void coroutine_fn fsdev_co_throttle_request(FsThrottle *, ThrottleDirection ,
31
- vu_queue_push(vu_dev, req->vq, req->elem, req->size + 1);
36
struct iovec *, int);
32
+ vu_queue_push(vu_dev, req->vq, &req->elem, req->size + 1);
37
33
vu_queue_notify(vu_dev, req->vq);
38
void fsdev_throttle_cleanup(FsThrottle *);
34
39
diff --git a/fsdev/qemu-fsdev-throttle.c b/fsdev/qemu-fsdev-throttle.c
35
- if (req->elem) {
40
index XXXXXXX..XXXXXXX 100644
36
- free(req->elem);
41
--- a/fsdev/qemu-fsdev-throttle.c
37
- }
42
+++ b/fsdev/qemu-fsdev-throttle.c
38
-
43
@@ -XXX,XX +XXX,XX @@ void fsdev_throttle_init(FsThrottle *fst)
39
- g_free(req);
40
+ free(req);
41
}
42
43
static VuBlockDev *get_vu_block_device_by_server(VuServer *server)
44
@@ -XXX,XX +XXX,XX @@ static void coroutine_fn vu_block_flush(VuBlockReq *req)
45
blk_co_flush(backend);
46
}
47
48
-struct req_data {
49
- VuServer *server;
50
- VuVirtq *vq;
51
- VuVirtqElement *elem;
52
-};
53
-
54
static void coroutine_fn vu_block_virtio_process_req(void *opaque)
55
{
56
- struct req_data *data = opaque;
57
- VuServer *server = data->server;
58
- VuVirtq *vq = data->vq;
59
- VuVirtqElement *elem = data->elem;
60
+ VuBlockReq *req = opaque;
61
+ VuServer *server = req->server;
62
+ VuVirtqElement *elem = &req->elem;
63
uint32_t type;
64
- VuBlockReq *req;
65
66
VuBlockDev *vdev_blk = get_vu_block_device_by_server(server);
67
BlockBackend *backend = vdev_blk->backend;
68
@@ -XXX,XX +XXX,XX @@ static void coroutine_fn vu_block_virtio_process_req(void *opaque)
69
struct iovec *out_iov = elem->out_sg;
70
unsigned in_num = elem->in_num;
71
unsigned out_num = elem->out_num;
72
+
73
/* refer to hw/block/virtio_blk.c */
74
if (elem->out_num < 1 || elem->in_num < 1) {
75
error_report("virtio-blk request missing headers");
76
- free(elem);
77
- return;
78
+ goto err;
79
}
80
81
- req = g_new0(VuBlockReq, 1);
82
- req->server = server;
83
- req->vq = vq;
84
- req->elem = elem;
85
-
86
if (unlikely(iov_to_buf(out_iov, out_num, 0, &req->out,
87
sizeof(req->out)) != sizeof(req->out))) {
88
error_report("virtio-blk request outhdr too short");
89
@@ -XXX,XX +XXX,XX @@ static void coroutine_fn vu_block_virtio_process_req(void *opaque)
90
91
err:
92
free(elem);
93
- g_free(req);
94
- return;
95
}
96
97
static void vu_block_process_vq(VuDev *vu_dev, int idx)
98
{
99
- VuServer *server;
100
- VuVirtq *vq;
101
- struct req_data *req_data;
102
+ VuServer *server = container_of(vu_dev, VuServer, vu_dev);
103
+ VuVirtq *vq = vu_get_queue(vu_dev, idx);
104
105
- server = container_of(vu_dev, VuServer, vu_dev);
106
- assert(server);
107
-
108
- vq = vu_get_queue(vu_dev, idx);
109
- assert(vq);
110
- VuVirtqElement *elem;
111
while (1) {
112
- elem = vu_queue_pop(vu_dev, vq, sizeof(VuVirtqElement) +
113
- sizeof(VuBlockReq));
114
- if (elem) {
115
- req_data = g_new0(struct req_data, 1);
116
- req_data->server = server;
117
- req_data->vq = vq;
118
- req_data->elem = elem;
119
- Coroutine *co = qemu_coroutine_create(vu_block_virtio_process_req,
120
- req_data);
121
- aio_co_enter(server->ioc->ctx, co);
122
- } else {
123
+ VuBlockReq *req;
124
+
125
+ req = vu_queue_pop(vu_dev, vq, sizeof(VuBlockReq));
126
+ if (!req) {
127
break;
128
}
129
+
130
+ req->server = server;
131
+ req->vq = vq;
132
+
133
+ Coroutine *co =
134
+ qemu_coroutine_create(vu_block_virtio_process_req, req);
135
+ qemu_coroutine_enter(co);
44
}
136
}
45
}
137
}
46
138
47
-void coroutine_fn fsdev_co_throttle_request(FsThrottle *fst, bool is_write,
48
+void coroutine_fn fsdev_co_throttle_request(FsThrottle *fst,
49
+ ThrottleDirection direction,
50
struct iovec *iov, int iovcnt)
51
{
52
- ThrottleDirection direction = is_write ? THROTTLE_WRITE : THROTTLE_READ;
53
-
54
+ assert(direction < THROTTLE_MAX);
55
if (throttle_enabled(&fst->cfg)) {
56
if (throttle_schedule_timer(&fst->ts, &fst->tt, direction) ||
57
- !qemu_co_queue_empty(&fst->throttled_reqs[is_write])) {
58
- qemu_co_queue_wait(&fst->throttled_reqs[is_write], NULL);
59
+ !qemu_co_queue_empty(&fst->throttled_reqs[direction])) {
60
+ qemu_co_queue_wait(&fst->throttled_reqs[direction], NULL);
61
}
62
63
throttle_account(&fst->ts, direction, iov_size(iov, iovcnt));
64
65
- if (!qemu_co_queue_empty(&fst->throttled_reqs[is_write]) &&
66
+ if (!qemu_co_queue_empty(&fst->throttled_reqs[direction]) &&
67
!throttle_schedule_timer(&fst->ts, &fst->tt, direction)) {
68
- qemu_co_queue_next(&fst->throttled_reqs[is_write]);
69
+ qemu_co_queue_next(&fst->throttled_reqs[direction]);
70
}
71
}
72
}
73
diff --git a/hw/9pfs/cofile.c b/hw/9pfs/cofile.c
74
index XXXXXXX..XXXXXXX 100644
75
--- a/hw/9pfs/cofile.c
76
+++ b/hw/9pfs/cofile.c
77
@@ -XXX,XX +XXX,XX @@ int coroutine_fn v9fs_co_pwritev(V9fsPDU *pdu, V9fsFidState *fidp,
78
if (v9fs_request_cancelled(pdu)) {
79
return -EINTR;
80
}
81
- fsdev_co_throttle_request(s->ctx.fst, true, iov, iovcnt);
82
+ fsdev_co_throttle_request(s->ctx.fst, THROTTLE_WRITE, iov, iovcnt);
83
v9fs_co_run_in_worker(
84
{
85
err = s->ops->pwritev(&s->ctx, &fidp->fs, iov, iovcnt, offset);
86
@@ -XXX,XX +XXX,XX @@ int coroutine_fn v9fs_co_preadv(V9fsPDU *pdu, V9fsFidState *fidp,
87
if (v9fs_request_cancelled(pdu)) {
88
return -EINTR;
89
}
90
- fsdev_co_throttle_request(s->ctx.fst, false, iov, iovcnt);
91
+ fsdev_co_throttle_request(s->ctx.fst, THROTTLE_READ, iov, iovcnt);
92
v9fs_co_run_in_worker(
93
{
94
err = s->ops->preadv(&s->ctx, &fidp->fs, iov, iovcnt, offset);
95
--
139
--
96
2.41.0
140
2.26.2
141
New patch
1
The device panic notifier callback is not used. Drop it.
1
2
3
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
4
Message-id: 20200924151549.913737-7-stefanha@redhat.com
5
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
6
---
7
util/vhost-user-server.h | 3 ---
8
block/export/vhost-user-blk-server.c | 3 +--
9
util/vhost-user-server.c | 6 ------
10
3 files changed, 1 insertion(+), 11 deletions(-)
11
12
diff --git a/util/vhost-user-server.h b/util/vhost-user-server.h
13
index XXXXXXX..XXXXXXX 100644
14
--- a/util/vhost-user-server.h
15
+++ b/util/vhost-user-server.h
16
@@ -XXX,XX +XXX,XX @@ typedef struct VuFdWatch {
17
} VuFdWatch;
18
19
typedef struct VuServer VuServer;
20
-typedef void DevicePanicNotifierFn(VuServer *server);
21
22
struct VuServer {
23
QIONetListener *listener;
24
AioContext *ctx;
25
- DevicePanicNotifierFn *device_panic_notifier;
26
int max_queues;
27
const VuDevIface *vu_iface;
28
VuDev vu_dev;
29
@@ -XXX,XX +XXX,XX @@ bool vhost_user_server_start(VuServer *server,
30
SocketAddress *unix_socket,
31
AioContext *ctx,
32
uint16_t max_queues,
33
- DevicePanicNotifierFn *device_panic_notifier,
34
const VuDevIface *vu_iface,
35
Error **errp);
36
37
diff --git a/block/export/vhost-user-blk-server.c b/block/export/vhost-user-blk-server.c
38
index XXXXXXX..XXXXXXX 100644
39
--- a/block/export/vhost-user-blk-server.c
40
+++ b/block/export/vhost-user-blk-server.c
41
@@ -XXX,XX +XXX,XX @@ static void vhost_user_blk_server_start(VuBlockDev *vu_block_device,
42
ctx = bdrv_get_aio_context(blk_bs(vu_block_device->backend));
43
44
if (!vhost_user_server_start(&vu_block_device->vu_server, addr, ctx,
45
- VHOST_USER_BLK_MAX_QUEUES,
46
- NULL, &vu_block_iface,
47
+ VHOST_USER_BLK_MAX_QUEUES, &vu_block_iface,
48
errp)) {
49
goto error;
50
}
51
diff --git a/util/vhost-user-server.c b/util/vhost-user-server.c
52
index XXXXXXX..XXXXXXX 100644
53
--- a/util/vhost-user-server.c
54
+++ b/util/vhost-user-server.c
55
@@ -XXX,XX +XXX,XX @@ static void panic_cb(VuDev *vu_dev, const char *buf)
56
close_client(server);
57
}
58
59
- if (server->device_panic_notifier) {
60
- server->device_panic_notifier(server);
61
- }
62
-
63
/*
64
* Set the callback function for network listener so another
65
* vhost-user client can connect to this server
66
@@ -XXX,XX +XXX,XX @@ bool vhost_user_server_start(VuServer *server,
67
SocketAddress *socket_addr,
68
AioContext *ctx,
69
uint16_t max_queues,
70
- DevicePanicNotifierFn *device_panic_notifier,
71
const VuDevIface *vu_iface,
72
Error **errp)
73
{
74
@@ -XXX,XX +XXX,XX @@ bool vhost_user_server_start(VuServer *server,
75
.vu_iface = vu_iface,
76
.max_queues = max_queues,
77
.ctx = ctx,
78
- .device_panic_notifier = device_panic_notifier,
79
};
80
81
qio_net_listener_set_name(server->listener, "vhost-user-backend-listener");
82
--
83
2.26.2
84
New patch
1
fds[] is leaked when qio_channel_readv_full() fails.
1
2
3
Use vmsg->fds[] instead of keeping a local fds[] array. Then we can
4
reuse goto fail to clean up fds. vmsg->fd_num must be zeroed before the
5
loop to make this safe.
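
As a rough sketch of why the counter has to start at zero: if every batch of fds is appended directly to the message, the failure path can always close exactly fd_num descriptors. Msg, MAX_FDS and msg_append_fds() below are hypothetical names used for illustration only; the real limit is the size of vmsg->fds[].

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_FDS 8   /* hypothetical limit for this sketch */

/* Hypothetical stand-in for the fd-carrying part of VhostUserMsg. */
typedef struct {
    int fds[MAX_FDS];
    size_t fd_num;   /* must be 0 before the first append */
} Msg;

/*
 * Append the fds returned by one read call.
 * Returns false and leaves msg untouched if the batch would overflow fds[].
 */
static bool msg_append_fds(Msg *msg, const int *fds, size_t nfds)
{
    assert(msg->fd_num <= MAX_FDS);
    if (nfds == 0) {
        return true;   /* nothing to copy; also avoids memcpy() from NULL */
    }
    if (msg->fd_num + nfds > MAX_FDS) {
        return false;
    }
    memcpy(msg->fds + msg->fd_num, fds, nfds * sizeof(msg->fds[0]));
    msg->fd_num += nfds;
    return true;
}
```

A rejected batch still has to be freed by the caller, which is what the g_free(fds) before goto fail does in the patch.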
6
7
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
8
Message-id: 20200924151549.913737-8-stefanha@redhat.com
9
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10
---
11
util/vhost-user-server.c | 50 ++++++++++++++++++----------------------
12
1 file changed, 23 insertions(+), 27 deletions(-)
13
14
diff --git a/util/vhost-user-server.c b/util/vhost-user-server.c
15
index XXXXXXX..XXXXXXX 100644
16
--- a/util/vhost-user-server.c
17
+++ b/util/vhost-user-server.c
18
@@ -XXX,XX +XXX,XX @@ vu_message_read(VuDev *vu_dev, int conn_fd, VhostUserMsg *vmsg)
19
};
20
int rc, read_bytes = 0;
21
Error *local_err = NULL;
22
- /*
23
- * Store fds/nfds returned from qio_channel_readv_full into
24
- * temporary variables.
25
- *
26
- * VhostUserMsg is a packed structure, gcc will complain about passing
27
- * pointer to a packed structure member if we pass &VhostUserMsg.fd_num
28
- * and &VhostUserMsg.fds directly when calling qio_channel_readv_full,
29
- * thus two temporary variables nfds and fds are used here.
30
- */
31
- size_t nfds = 0, nfds_t = 0;
32
const size_t max_fds = G_N_ELEMENTS(vmsg->fds);
33
- int *fds_t = NULL;
34
VuServer *server = container_of(vu_dev, VuServer, vu_dev);
35
QIOChannel *ioc = server->ioc;
36
37
+ vmsg->fd_num = 0;
38
if (!ioc) {
39
error_report_err(local_err);
40
goto fail;
41
@@ -XXX,XX +XXX,XX @@ vu_message_read(VuDev *vu_dev, int conn_fd, VhostUserMsg *vmsg)
42
43
assert(qemu_in_coroutine());
44
do {
45
+ size_t nfds = 0;
46
+ int *fds = NULL;
47
+
48
/*
49
* qio_channel_readv_full may have short reads, keeping calling it
50
* until getting VHOST_USER_HDR_SIZE or 0 bytes in total
51
*/
52
- rc = qio_channel_readv_full(ioc, &iov, 1, &fds_t, &nfds_t, &local_err);
53
+ rc = qio_channel_readv_full(ioc, &iov, 1, &fds, &nfds, &local_err);
54
if (rc < 0) {
55
if (rc == QIO_CHANNEL_ERR_BLOCK) {
56
+ assert(local_err == NULL);
57
qio_channel_yield(ioc, G_IO_IN);
58
continue;
59
} else {
60
error_report_err(local_err);
61
- return false;
62
+ goto fail;
63
}
64
}
65
- read_bytes += rc;
66
- if (nfds_t > 0) {
67
- if (nfds + nfds_t > max_fds) {
68
+
69
+ if (nfds > 0) {
70
+ if (vmsg->fd_num + nfds > max_fds) {
71
error_report("A maximum of %zu fds are allowed, "
72
"however got %zu fds now",
73
- max_fds, nfds + nfds_t);
74
+ max_fds, vmsg->fd_num + nfds);
75
+ g_free(fds);
76
goto fail;
77
}
78
- memcpy(vmsg->fds + nfds, fds_t,
79
- nfds_t *sizeof(vmsg->fds[0]));
80
- nfds += nfds_t;
81
- g_free(fds_t);
82
+ memcpy(vmsg->fds + vmsg->fd_num, fds, nfds * sizeof(vmsg->fds[0]));
83
+ vmsg->fd_num += nfds;
84
+ g_free(fds);
85
}
86
- if (read_bytes == VHOST_USER_HDR_SIZE || rc == 0) {
87
- break;
88
+
89
+ if (rc == 0) { /* socket closed */
90
+ goto fail;
91
}
92
- iov.iov_base = (char *)vmsg + read_bytes;
93
- iov.iov_len = VHOST_USER_HDR_SIZE - read_bytes;
94
- } while (true);
95
96
- vmsg->fd_num = nfds;
97
+ iov.iov_base += rc;
98
+ iov.iov_len -= rc;
99
+ read_bytes += rc;
100
+ } while (read_bytes != VHOST_USER_HDR_SIZE);
101
+
102
/* qio_channel_readv_full will make socket fds blocking, unblock them */
103
vmsg_unblock_fds(vmsg);
104
if (vmsg->size > sizeof(vmsg->payload)) {
105
--
106
2.26.2
107
1
We must check that zone information is present before running
1
Unexpected EOF is an error that must be reported.
2
update_zones_wp().
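
The unexpected-EOF rule in the vhost-user-server message above can be sketched as a small classifier. This assumes the return convention of qio_channel_readv_all_eof() is 1 for a complete read, 0 for end-of-file before any data, and -1 for an error; payload_read_ok() and its parameters are hypothetical names, not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Assumed return convention of the read call:
 *    1  all requested bytes were read
 *    0  end-of-file before any data arrived
 *   -1  error, with an Error object set
 *
 * Returns true only for a complete read. *report_error is set when the
 * caller holds an Error object that should be reported.
 */
static bool payload_read_ok(int rc, bool have_error, bool *report_error)
{
    assert(rc == 1 || rc == 0 || rc == -1);
    /* Plain EOF carries no Error object, so there is nothing to report. */
    *report_error = (rc != 1) && have_error;
    return rc == 1;
}
```

Treating anything other than 1 as failure is what turns a silent EOF into the goto fail path.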
3
2
4
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=2234374
3
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
5
Fixes: Coverity CID 1512459
4
Message-id: 20200924151549.913737-9-stefanha@redhat.com
6
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
5
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
7
Message-Id: <20230824155345.109765-4-hreitz@redhat.com>
8
Reviewed-by: Sam Li <faithilikerun@gmail.com>
9
---
6
---
10
block/file-posix.c | 3 ++-
7
util/vhost-user-server.c | 6 ++++--
11
1 file changed, 2 insertions(+), 1 deletion(-)
8
1 file changed, 4 insertions(+), 2 deletions(-)
12
9
13
diff --git a/block/file-posix.c b/block/file-posix.c
10
diff --git a/util/vhost-user-server.c b/util/vhost-user-server.c
14
index XXXXXXX..XXXXXXX 100644
11
index XXXXXXX..XXXXXXX 100644
15
--- a/block/file-posix.c
12
--- a/util/vhost-user-server.c
16
+++ b/block/file-posix.c
13
+++ b/util/vhost-user-server.c
17
@@ -XXX,XX +XXX,XX @@ out:
14
@@ -XXX,XX +XXX,XX @@ vu_message_read(VuDev *vu_dev, int conn_fd, VhostUserMsg *vmsg)
18
}
15
};
19
}
16
if (vmsg->size) {
20
} else {
17
rc = qio_channel_readv_all_eof(ioc, &iov_payload, 1, &local_err);
21
- if (type & (QEMU_AIO_WRITE | QEMU_AIO_ZONE_APPEND)) {
18
- if (rc == -1) {
22
+ if ((type & (QEMU_AIO_WRITE | QEMU_AIO_ZONE_APPEND)) &&
19
- error_report_err(local_err);
23
+ bs->bl.zoned != BLK_Z_NONE) {
20
+ if (rc != 1) {
24
update_zones_wp(bs, s->fd, 0, 1);
21
+ if (local_err) {
22
+ error_report_err(local_err);
23
+ }
24
goto fail;
25
}
25
}
26
}
26
}
27
--
27
--
28
2.41.0
28
2.26.2
29
New patch
1
The vu_client_trip() coroutine is leaked during AioContext switching. It
2
is also unsafe to destroy the vu_dev in panic_cb() since its callers
3
still access it in some cases.
1
4
5
Rework the lifecycle to solve these safety issues.
6
7
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
8
Message-id: 20200924151549.913737-10-stefanha@redhat.com
9
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10
---
11
util/vhost-user-server.h | 29 ++--
12
block/export/vhost-user-blk-server.c | 9 +-
13
util/vhost-user-server.c | 245 +++++++++++++++------------
14
3 files changed, 155 insertions(+), 128 deletions(-)
15
16
diff --git a/util/vhost-user-server.h b/util/vhost-user-server.h
17
index XXXXXXX..XXXXXXX 100644
18
--- a/util/vhost-user-server.h
19
+++ b/util/vhost-user-server.h
20
@@ -XXX,XX +XXX,XX @@
21
#include "qapi/error.h"
22
#include "standard-headers/linux/virtio_blk.h"
23
24
+/* A kick fd that we monitor on behalf of libvhost-user */
25
typedef struct VuFdWatch {
26
VuDev *vu_dev;
27
int fd; /*kick fd*/
28
void *pvt;
29
vu_watch_cb cb;
30
- bool processing;
31
QTAILQ_ENTRY(VuFdWatch) next;
32
} VuFdWatch;
33
34
-typedef struct VuServer VuServer;
35
-
36
-struct VuServer {
37
+/**
38
+ * VuServer:
39
+ * A vhost-user server instance with user-defined VuDevIface callbacks.
40
+ * Vhost-user device backends can be implemented using VuServer. VuDevIface
41
+ * callbacks and virtqueue kicks run in the given AioContext.
42
+ */
43
+typedef struct {
44
QIONetListener *listener;
45
+ QEMUBH *restart_listener_bh;
46
AioContext *ctx;
47
int max_queues;
48
const VuDevIface *vu_iface;
49
+
50
+ /* Protected by ctx lock */
51
VuDev vu_dev;
52
QIOChannel *ioc; /* The I/O channel with the client */
53
QIOChannelSocket *sioc; /* The underlying data channel with the client */
54
- /* IOChannel for fd provided via VHOST_USER_SET_SLAVE_REQ_FD */
55
- QIOChannel *ioc_slave;
56
- QIOChannelSocket *sioc_slave;
57
- Coroutine *co_trip; /* coroutine for processing VhostUserMsg */
58
QTAILQ_HEAD(, VuFdWatch) vu_fd_watches;
59
- /* restart coroutine co_trip if AIOContext is changed */
60
- bool aio_context_changed;
61
- bool processing_msg;
62
-};
63
+
64
+ Coroutine *co_trip; /* coroutine for processing VhostUserMsg */
65
+} VuServer;
66
67
bool vhost_user_server_start(VuServer *server,
68
SocketAddress *unix_socket,
69
@@ -XXX,XX +XXX,XX @@ bool vhost_user_server_start(VuServer *server,
70
71
void vhost_user_server_stop(VuServer *server);
72
73
-void vhost_user_server_set_aio_context(VuServer *server, AioContext *ctx);
74
+void vhost_user_server_attach_aio_context(VuServer *server, AioContext *ctx);
75
+void vhost_user_server_detach_aio_context(VuServer *server);
76
77
#endif /* VHOST_USER_SERVER_H */
78
diff --git a/block/export/vhost-user-blk-server.c b/block/export/vhost-user-blk-server.c
79
index XXXXXXX..XXXXXXX 100644
80
--- a/block/export/vhost-user-blk-server.c
81
+++ b/block/export/vhost-user-blk-server.c
82
@@ -XXX,XX +XXX,XX @@ static const VuDevIface vu_block_iface = {
83
static void blk_aio_attached(AioContext *ctx, void *opaque)
84
{
85
VuBlockDev *vub_dev = opaque;
86
- aio_context_acquire(ctx);
87
- vhost_user_server_set_aio_context(&vub_dev->vu_server, ctx);
88
- aio_context_release(ctx);
89
+ vhost_user_server_attach_aio_context(&vub_dev->vu_server, ctx);
90
}
91
92
static void blk_aio_detach(void *opaque)
93
{
94
VuBlockDev *vub_dev = opaque;
95
- AioContext *ctx = vub_dev->vu_server.ctx;
96
- aio_context_acquire(ctx);
97
- vhost_user_server_set_aio_context(&vub_dev->vu_server, NULL);
98
- aio_context_release(ctx);
99
+ vhost_user_server_detach_aio_context(&vub_dev->vu_server);
100
}
101
102
static void
103
diff --git a/util/vhost-user-server.c b/util/vhost-user-server.c
104
index XXXXXXX..XXXXXXX 100644
105
--- a/util/vhost-user-server.c
106
+++ b/util/vhost-user-server.c
107
@@ -XXX,XX +XXX,XX @@
108
*/
109
#include "qemu/osdep.h"
110
#include "qemu/main-loop.h"
111
+#include "block/aio-wait.h"
112
#include "vhost-user-server.h"
113
114
+/*
115
+ * Theory of operation:
116
+ *
117
+ * VuServer is started and stopped by vhost_user_server_start() and
118
+ * vhost_user_server_stop() from the main loop thread. Starting the server
119
+ * opens a vhost-user UNIX domain socket and listens for incoming connections.
120
+ * Only one connection is allowed at a time.
121
+ *
122
+ * The connection is handled by the vu_client_trip() coroutine in the
123
+ * VuServer->ctx AioContext. The coroutine consists of a vu_dispatch() loop
124
+ * where libvhost-user calls vu_message_read() to receive the next vhost-user
125
+ * protocol messages over the UNIX domain socket.
126
+ *
127
+ * When virtqueues are set up libvhost-user calls set_watch() to monitor kick
128
+ * fds. These fds are also handled in the VuServer->ctx AioContext.
129
+ *
130
+ * Both vu_client_trip() and kick fd monitoring can be stopped by shutting down
131
+ * the socket connection. Shutting down the socket connection causes
132
+ * vu_message_read() to fail since no more data can be received from the socket.
133
+ * After vu_dispatch() fails, vu_client_trip() calls vu_deinit() to stop
134
+ * libvhost-user before terminating the coroutine. vu_deinit() calls
135
+ * remove_watch() to stop monitoring kick fds and this stops virtqueue
136
+ * processing.
137
+ *
138
+ * When vu_client_trip() has finished cleaning up it schedules a BH in the main
139
+ * loop thread to accept the next client connection.
140
+ *
141
+ * When libvhost-user detects an error it calls panic_cb() and sets the
142
+ * dev->broken flag. Both vu_client_trip() and kick fd processing stop when
143
+ * the dev->broken flag is set.
144
+ *
145
+ * It is possible to switch AioContexts using
146
+ * vhost_user_server_detach_aio_context() and
147
+ * vhost_user_server_attach_aio_context(). They stop monitoring fds in the old
148
+ * AioContext and resume monitoring in the new AioContext. The vu_client_trip()
149
+ * coroutine remains in a yielded state during the switch. This is made
150
+ * possible by QIOChannel's support for spurious coroutine re-entry in
151
+ * qio_channel_yield(). The coroutine will restart I/O when re-entered from the
152
+ * new AioContext.
153
+ */
154
+
155
static void vmsg_close_fds(VhostUserMsg *vmsg)
156
{
157
int i;
158
@@ -XXX,XX +XXX,XX @@ static void vmsg_unblock_fds(VhostUserMsg *vmsg)
159
}
160
}
161
162
-static void vu_accept(QIONetListener *listener, QIOChannelSocket *sioc,
163
- gpointer opaque);
164
-
165
-static void close_client(VuServer *server)
166
-{
167
- /*
168
- * Before closing the client
169
- *
170
- * 1. Let vu_client_trip stop processing new vhost-user msg
171
- *
172
- * 2. remove kick_handler
173
- *
174
- * 3. wait for the kick handler to be finished
175
- *
176
- * 4. wait for the current vhost-user msg to be finished processing
177
- */
178
-
179
- QIOChannelSocket *sioc = server->sioc;
180
- /* When this is set vu_client_trip will stop new processing vhost-user message */
181
- server->sioc = NULL;
182
-
183
- while (server->processing_msg) {
184
- if (server->ioc->read_coroutine) {
185
- server->ioc->read_coroutine = NULL;
186
- qio_channel_set_aio_fd_handler(server->ioc, server->ioc->ctx, NULL,
187
- NULL, server->ioc);
188
- server->processing_msg = false;
189
- }
190
- }
191
-
192
- vu_deinit(&server->vu_dev);
193
-
194
- /* vu_deinit() should have called remove_watch() */
195
- assert(QTAILQ_EMPTY(&server->vu_fd_watches));
196
-
197
- object_unref(OBJECT(sioc));
198
- object_unref(OBJECT(server->ioc));
199
-}
200
-
201
static void panic_cb(VuDev *vu_dev, const char *buf)
202
{
203
- VuServer *server = container_of(vu_dev, VuServer, vu_dev);
204
-
205
- /* avoid while loop in close_client */
206
- server->processing_msg = false;
207
-
208
- if (buf) {
209
- error_report("vu_panic: %s", buf);
210
- }
211
-
212
- if (server->sioc) {
213
- close_client(server);
214
- }
215
-
216
- /*
217
- * Set the callback function for network listener so another
218
- * vhost-user client can connect to this server
219
- */
220
- qio_net_listener_set_client_func(server->listener,
221
- vu_accept,
222
- server,
223
- NULL);
224
+ error_report("vu_panic: %s", buf);
225
}
226
227
static bool coroutine_fn
228
@@ -XXX,XX +XXX,XX @@ fail:
229
return false;
230
}
231
232
-
233
-static void vu_client_start(VuServer *server);
234
static coroutine_fn void vu_client_trip(void *opaque)
235
{
236
VuServer *server = opaque;
237
+ VuDev *vu_dev = &server->vu_dev;
238
239
- while (!server->aio_context_changed && server->sioc) {
240
- server->processing_msg = true;
241
- vu_dispatch(&server->vu_dev);
242
- server->processing_msg = false;
243
+ while (!vu_dev->broken && vu_dispatch(vu_dev)) {
244
+ /* Keep running */
245
}
246
247
- if (server->aio_context_changed && server->sioc) {
248
- server->aio_context_changed = false;
249
- vu_client_start(server);
250
- }
251
-}
252
+ vu_deinit(vu_dev);
253
+
254
+ /* vu_deinit() should have called remove_watch() */
255
+ assert(QTAILQ_EMPTY(&server->vu_fd_watches));
256
+
257
+ object_unref(OBJECT(server->sioc));
258
+ server->sioc = NULL;
259
260
-static void vu_client_start(VuServer *server)
261
-{
262
- server->co_trip = qemu_coroutine_create(vu_client_trip, server);
263
- aio_co_enter(server->ctx, server->co_trip);
264
+ object_unref(OBJECT(server->ioc));
265
+ server->ioc = NULL;
266
+
267
+ server->co_trip = NULL;
268
+ if (server->restart_listener_bh) {
269
+ qemu_bh_schedule(server->restart_listener_bh);
270
+ }
271
+ aio_wait_kick();
272
}
273
274
/*
275
@@ -XXX,XX +XXX,XX @@ static void vu_client_start(VuServer *server)
276
static void kick_handler(void *opaque)
277
{
278
VuFdWatch *vu_fd_watch = opaque;
279
- vu_fd_watch->processing = true;
280
- vu_fd_watch->cb(vu_fd_watch->vu_dev, 0, vu_fd_watch->pvt);
281
- vu_fd_watch->processing = false;
282
+ VuDev *vu_dev = vu_fd_watch->vu_dev;
283
+
284
+ vu_fd_watch->cb(vu_dev, 0, vu_fd_watch->pvt);
285
+
286
+ /* Stop vu_client_trip() if an error occurred in vu_fd_watch->cb() */
287
+ if (vu_dev->broken) {
288
+ VuServer *server = container_of(vu_dev, VuServer, vu_dev);
289
+
290
+ qio_channel_shutdown(server->ioc, QIO_CHANNEL_SHUTDOWN_BOTH, NULL);
291
+ }
292
}
293
294
-
295
static VuFdWatch *find_vu_fd_watch(VuServer *server, int fd)
296
{
297
298
@@ -XXX,XX +XXX,XX @@ static void vu_accept(QIONetListener *listener, QIOChannelSocket *sioc,
299
qio_channel_set_name(QIO_CHANNEL(sioc), "vhost-user client");
300
server->ioc = QIO_CHANNEL(sioc);
301
object_ref(OBJECT(server->ioc));
302
- qio_channel_attach_aio_context(server->ioc, server->ctx);
303
+
304
+ /* TODO vu_message_write() spins if non-blocking! */
305
qio_channel_set_blocking(server->ioc, false, NULL);
306
- vu_client_start(server);
307
+
308
+ server->co_trip = qemu_coroutine_create(vu_client_trip, server);
309
+
310
+ aio_context_acquire(server->ctx);
311
+ vhost_user_server_attach_aio_context(server, server->ctx);
312
+ aio_context_release(server->ctx);
313
}
314
315
-
316
void vhost_user_server_stop(VuServer *server)
317
{
318
+ aio_context_acquire(server->ctx);
319
+
320
+ qemu_bh_delete(server->restart_listener_bh);
321
+ server->restart_listener_bh = NULL;
322
+
323
if (server->sioc) {
324
- close_client(server);
325
+ VuFdWatch *vu_fd_watch;
326
+
327
+ QTAILQ_FOREACH(vu_fd_watch, &server->vu_fd_watches, next) {
328
+ aio_set_fd_handler(server->ctx, vu_fd_watch->fd, true,
329
+ NULL, NULL, NULL, vu_fd_watch);
330
+ }
331
+
332
+ qio_channel_shutdown(server->ioc, QIO_CHANNEL_SHUTDOWN_BOTH, NULL);
333
+
334
+ AIO_WAIT_WHILE(server->ctx, server->co_trip);
335
}
336
337
+ aio_context_release(server->ctx);
338
+
339
if (server->listener) {
340
qio_net_listener_disconnect(server->listener);
341
object_unref(OBJECT(server->listener));
342
}
343
+}
344
+
345
+/*
346
+ * Allow the next client to connect to the server. Called from a BH in the main
347
+ * loop.
348
+ */
349
+static void restart_listener_bh(void *opaque)
350
+{
351
+ VuServer *server = opaque;
352
353
+ qio_net_listener_set_client_func(server->listener, vu_accept, server,
354
+ NULL);
355
}
356
357
-void vhost_user_server_set_aio_context(VuServer *server, AioContext *ctx)
358
+/* Called with ctx acquired */
359
+void vhost_user_server_attach_aio_context(VuServer *server, AioContext *ctx)
360
{
361
- VuFdWatch *vu_fd_watch, *next;
362
- void *opaque = NULL;
363
- IOHandler *io_read = NULL;
364
- bool attach;
365
+ VuFdWatch *vu_fd_watch;
366
367
- server->ctx = ctx ? ctx : qemu_get_aio_context();
368
+ server->ctx = ctx;
369
370
if (!server->sioc) {
371
- /* not yet serving any client*/
372
return;
373
}
374
375
- if (ctx) {
376
- qio_channel_attach_aio_context(server->ioc, ctx);
377
- server->aio_context_changed = true;
378
- io_read = kick_handler;
379
- attach = true;
380
- } else {
381
+ qio_channel_attach_aio_context(server->ioc, ctx);
382
+
383
+ QTAILQ_FOREACH(vu_fd_watch, &server->vu_fd_watches, next) {
384
+ aio_set_fd_handler(ctx, vu_fd_watch->fd, true, kick_handler, NULL,
385
+ NULL, vu_fd_watch);
386
+ }
387
+
388
+ aio_co_schedule(ctx, server->co_trip);
389
+}
390
+
391
+/* Called with server->ctx acquired */
392
+void vhost_user_server_detach_aio_context(VuServer *server)
393
+{
394
+ if (server->sioc) {
395
+ VuFdWatch *vu_fd_watch;
396
+
397
+ QTAILQ_FOREACH(vu_fd_watch, &server->vu_fd_watches, next) {
398
+ aio_set_fd_handler(server->ctx, vu_fd_watch->fd, true,
399
+ NULL, NULL, NULL, vu_fd_watch);
400
+ }
401
+
402
qio_channel_detach_aio_context(server->ioc);
403
- /* server->ioc->ctx keeps the old AioConext */
404
- ctx = server->ioc->ctx;
405
- attach = false;
406
}
407
408
- QTAILQ_FOREACH_SAFE(vu_fd_watch, &server->vu_fd_watches, next, next) {
409
- if (vu_fd_watch->cb) {
410
- opaque = attach ? vu_fd_watch : NULL;
411
- aio_set_fd_handler(ctx, vu_fd_watch->fd, true,
412
- io_read, NULL, NULL,
413
- opaque);
414
- }
415
- }
416
+ server->ctx = NULL;
417
}
418
419
-
420
bool vhost_user_server_start(VuServer *server,
421
SocketAddress *socket_addr,
422
AioContext *ctx,
423
@@ -XXX,XX +XXX,XX @@ bool vhost_user_server_start(VuServer *server,
424
const VuDevIface *vu_iface,
425
Error **errp)
426
{
427
+ QEMUBH *bh;
428
QIONetListener *listener = qio_net_listener_new();
429
if (qio_net_listener_open_sync(listener, socket_addr, 1,
430
errp) < 0) {
431
@@ -XXX,XX +XXX,XX @@ bool vhost_user_server_start(VuServer *server,
432
return false;
433
}
434
435
+ bh = qemu_bh_new(restart_listener_bh, server);
436
+
437
/* zero out unspecified fields */
438
*server = (VuServer) {
439
.listener = listener,
440
+ .restart_listener_bh = bh,
441
.vu_iface = vu_iface,
442
.max_queues = max_queues,
443
.ctx = ctx,
444
--
445
2.26.2
446
New patch
1
Propagate the flush return value since errors are possible.
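
A minimal sketch of the mapping this implies, assuming the flush call returns 0 on success and a negative errno on failure. flush_status() is a hypothetical helper; the two status values are the ones defined by the virtio-blk specification.

```c
#include <assert.h>
#include <stdint.h>

/* Status codes from the virtio-blk specification. */
#define VIRTIO_BLK_S_OK    0
#define VIRTIO_BLK_S_IOERR 1

/* Translate a flush return value into the status byte sent to the guest. */
static uint8_t flush_status(int ret)
{
    assert(ret <= 0);   /* assumption: 0 or a negative errno */
    return ret == 0 ? VIRTIO_BLK_S_OK : VIRTIO_BLK_S_IOERR;
}
```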
1
2
3
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
4
Message-id: 20200924151549.913737-11-stefanha@redhat.com
5
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
6
---
7
block/export/vhost-user-blk-server.c | 11 +++++++----
8
1 file changed, 7 insertions(+), 4 deletions(-)
9
10
diff --git a/block/export/vhost-user-blk-server.c b/block/export/vhost-user-blk-server.c
11
index XXXXXXX..XXXXXXX 100644
12
--- a/block/export/vhost-user-blk-server.c
13
+++ b/block/export/vhost-user-blk-server.c
14
@@ -XXX,XX +XXX,XX @@ vu_block_discard_write_zeroes(VuBlockReq *req, struct iovec *iov,
15
return -EINVAL;
16
}
17
18
-static void coroutine_fn vu_block_flush(VuBlockReq *req)
19
+static int coroutine_fn vu_block_flush(VuBlockReq *req)
20
{
21
VuBlockDev *vdev_blk = get_vu_block_device_by_server(req->server);
22
BlockBackend *backend = vdev_blk->backend;
23
- blk_co_flush(backend);
24
+ return blk_co_flush(backend);
25
}
26
27
static void coroutine_fn vu_block_virtio_process_req(void *opaque)
28
@@ -XXX,XX +XXX,XX @@ static void coroutine_fn vu_block_virtio_process_req(void *opaque)
29
break;
30
}
31
case VIRTIO_BLK_T_FLUSH:
32
- vu_block_flush(req);
33
- req->in->status = VIRTIO_BLK_S_OK;
34
+ if (vu_block_flush(req) == 0) {
35
+ req->in->status = VIRTIO_BLK_S_OK;
36
+ } else {
37
+ req->in->status = VIRTIO_BLK_S_IOERR;
38
+ }
39
break;
40
case VIRTIO_BLK_T_GET_ID: {
41
size_t size = MIN(iov_size(&elem->in_sg[0], in_num),
42
--
43
2.26.2
44
1
From: zhenwei pi <pizhenwei@bytedance.com>
1
Use the new QAPI block exports API instead of defining our own QOM
2
2
objects.
3
enum ThrottleDirection already exists, so use ThrottleDirection instead
3
4
of 'bool is_write' in the throttle API, and update the related code in
4
This is a large change because the lifecycle of VuBlockDev needs to
5
block, fsdev, cryptodev and tests.
5
follow BlockExportDriver. QOM properties are replaced by QAPI options
6
6
objects.
7
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
7
8
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
8
VuBlockDev is renamed VuBlkExport and contains a BlockExport field.
9
Message-Id: <20230728022006.1098509-7-pizhenwei@bytedance.com>
9
Several fields can be dropped since BlockExport already has equivalents.
10
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
10
11
The file names and meson build integration will be adjusted in a future
12
patch. libvhost-user should probably be built as a static library that
13
is linked into QEMU instead of as a .c file that results in duplicate
14
compilation.
15
16
The new command-line syntax is:
17
18
$ qemu-storage-daemon \
19
--blockdev file,node-name=drive0,filename=test.img \
20
--export vhost-user-blk,node-name=drive0,id=export0,unix-socket=/tmp/vhost-user-blk.sock
21
22
Note that unix-socket is optional because we may wish to accept chardevs
23
too in the future.
24
25
Markus noted that supported address families are not explicit in the
26
QAPI schema. It is unlikely that support for more address families will
27
be added since file descriptor passing is required and few address
28
families support it. If a new address family needs to be added, then the
29
QAPI 'features' syntax can be used to advertise it.
30
31
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
32
Acked-by: Markus Armbruster <armbru@redhat.com>
33
Message-id: 20200924151549.913737-12-stefanha@redhat.com
34
[Skip test on big-endian host architectures because this device doesn't
35
support them yet (as already mentioned in a code comment).
36
--Stefan]
37
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
11
---
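Note: a minimal sketch of the conversion pattern used by callers that still carry a 'bool is_write'. The enum is redeclared locally so the fragment compiles on its own and is assumed to match the one in include/qemu/throttle.h. throttle_direction_from_bool() is a hypothetical helper name; the patch itself open-codes the ternary at each call site.

```c
#include <assert.h>
#include <stdbool.h>

/* Local stand-in for the enum in include/qemu/throttle.h (assumed layout). */
typedef enum {
    THROTTLE_READ = 0,
    THROTTLE_WRITE,
    THROTTLE_MAX
} ThrottleDirection;

/* Hypothetical helper: map the legacy boolean onto the enum. */
static ThrottleDirection throttle_direction_from_bool(bool is_write)
{
    ThrottleDirection direction = is_write ? THROTTLE_WRITE : THROTTLE_READ;

    /* The result must always be a valid index below THROTTLE_MAX. */
    assert(direction < THROTTLE_MAX);
    return direction;
}
```

Under that assumed layout, THROTTLE_READ and THROTTLE_WRITE have the same values as false and true, which is why code that still indexes arrays with is_write keeps working next to the converted calls.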
38
---
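Note: a minimal sketch of the embedding pattern that VuBlkExport relies on. The generic object is a member of the driver-specific struct, and container_of() recovers the outer pointer from a pointer to the member. The struct bodies below are trimmed stand-ins rather than the QEMU definitions, the macro is a simplified version without QEMU's type checking, and vexp_from_server() is a hypothetical helper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified container_of(); assumed equivalent in effect to QEMU's macro. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Trimmed stand-ins for the real structures. */
typedef struct { int id; } BlockExport;
typedef struct { int max_queues; } VuServer;

typedef struct {
    BlockExport export;   /* generic part handled by the export layer */
    VuServer vu_server;   /* vhost-user connection state */
    bool writable;
} VuBlkExport;

/* Hypothetical helper: map a VuServer callback argument to its export. */
static VuBlkExport *vexp_from_server(VuServer *server)
{
    assert(server != NULL);
    return container_of(server, VuBlkExport, vu_server);
}
```

The same macro works in the other direction of the lifecycle: callbacks that receive a BlockExport pointer can reach the VuBlkExport through the export member.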
12
include/qemu/throttle.h | 5 +++--
39
qapi/block-export.json | 21 +-
13
backends/cryptodev.c | 9 +++++----
40
block/export/vhost-user-blk-server.h | 23 +-
14
block/throttle-groups.c | 6 ++++--
41
block/export/export.c | 6 +
15
fsdev/qemu-fsdev-throttle.c | 8 +++++---
42
block/export/vhost-user-blk-server.c | 452 +++++++--------------------
16
tests/unit/test-throttle.c | 4 ++--
43
util/vhost-user-server.c | 10 +-
17
util/throttle.c | 31 +++++++++++++++++--------------
44
block/export/meson.build | 1 +
18
6 files changed, 36 insertions(+), 27 deletions(-)
45
block/meson.build | 1 -
19
46
7 files changed, 156 insertions(+), 358 deletions(-)
20
diff --git a/include/qemu/throttle.h b/include/qemu/throttle.h
47
48
diff --git a/qapi/block-export.json b/qapi/block-export.json
21
index XXXXXXX..XXXXXXX 100644
49
index XXXXXXX..XXXXXXX 100644
22
--- a/include/qemu/throttle.h
50
--- a/qapi/block-export.json
23
+++ b/include/qemu/throttle.h
51
+++ b/qapi/block-export.json
24
@@ -XXX,XX +XXX,XX @@ void throttle_config_init(ThrottleConfig *cfg);
52
@@ -XXX,XX +XXX,XX @@
25
/* usage */
53
'data': { '*name': 'str', '*description': 'str',
26
bool throttle_schedule_timer(ThrottleState *ts,
54
'*bitmap': 'str' } }
27
ThrottleTimers *tt,
55
28
- bool is_write);
56
+##
29
+ ThrottleDirection direction);
57
+# @BlockExportOptionsVhostUserBlk:
30
58
+#
31
-void throttle_account(ThrottleState *ts, bool is_write, uint64_t size);
59
+# A vhost-user-blk block export.
32
+void throttle_account(ThrottleState *ts, ThrottleDirection direction,
60
+#
33
+ uint64_t size);
61
+# @addr: The vhost-user socket on which to listen. Both 'unix' and 'fd'
34
void throttle_limits_to_config(ThrottleLimits *arg, ThrottleConfig *cfg,
62
+# SocketAddress types are supported. Passed fds must be UNIX domain
35
Error **errp);
63
+# sockets.
36
void throttle_config_to_limits(ThrottleConfig *cfg, ThrottleLimits *var);
64
+# @logical-block-size: Logical block size in bytes. Defaults to 512 bytes.
37
diff --git a/backends/cryptodev.c b/backends/cryptodev.c
65
+#
66
+# Since: 5.2
67
+##
68
+{ 'struct': 'BlockExportOptionsVhostUserBlk',
69
+ 'data': { 'addr': 'SocketAddress', '*logical-block-size': 'size' } }
70
+
71
##
72
# @NbdServerAddOptions:
73
#
74
@@ -XXX,XX +XXX,XX @@
75
# An enumeration of block export types
76
#
77
# @nbd: NBD export
78
+# @vhost-user-blk: vhost-user-blk export (since 5.2)
79
#
80
# Since: 4.2
81
##
82
{ 'enum': 'BlockExportType',
83
- 'data': [ 'nbd' ] }
84
+ 'data': [ 'nbd', 'vhost-user-blk' ] }
85
86
##
87
# @BlockExportOptions:
88
@@ -XXX,XX +XXX,XX @@
89
'*writethrough': 'bool' },
90
'discriminator': 'type',
91
'data': {
92
- 'nbd': 'BlockExportOptionsNbd'
93
+ 'nbd': 'BlockExportOptionsNbd',
94
+ 'vhost-user-blk': 'BlockExportOptionsVhostUserBlk'
95
} }
96
97
##
98
diff --git a/block/export/vhost-user-blk-server.h b/block/export/vhost-user-blk-server.h
38
index XXXXXXX..XXXXXXX 100644
99
index XXXXXXX..XXXXXXX 100644
39
--- a/backends/cryptodev.c
100
--- a/block/export/vhost-user-blk-server.h
40
+++ b/backends/cryptodev.c
101
+++ b/block/export/vhost-user-blk-server.h
41
@@ -XXX,XX +XXX,XX @@ static void cryptodev_backend_throttle_timer_cb(void *opaque)
102
@@ -XXX,XX +XXX,XX @@
42
continue;
103
104
#ifndef VHOST_USER_BLK_SERVER_H
105
#define VHOST_USER_BLK_SERVER_H
106
-#include "util/vhost-user-server.h"
107
108
-typedef struct VuBlockDev VuBlockDev;
109
-#define TYPE_VHOST_USER_BLK_SERVER "vhost-user-blk-server"
110
-#define VHOST_USER_BLK_SERVER(obj) \
111
- OBJECT_CHECK(VuBlockDev, obj, TYPE_VHOST_USER_BLK_SERVER)
112
+#include "block/export.h"
113
114
-/* vhost user block device */
115
-struct VuBlockDev {
116
- Object parent_obj;
117
- char *node_name;
118
- SocketAddress *addr;
119
- AioContext *ctx;
120
- VuServer vu_server;
121
- bool running;
122
- uint32_t blk_size;
123
- BlockBackend *backend;
124
- QIOChannelSocket *sioc;
125
- QTAILQ_ENTRY(VuBlockDev) next;
126
- struct virtio_blk_config blkcfg;
127
- bool writable;
128
-};
129
+/* For block/export/export.c */
130
+extern const BlockExportDriver blk_exp_vhost_user_blk;
131
132
#endif /* VHOST_USER_BLK_SERVER_H */
133
diff --git a/block/export/export.c b/block/export/export.c
134
index XXXXXXX..XXXXXXX 100644
135
--- a/block/export/export.c
136
+++ b/block/export/export.c
137
@@ -XXX,XX +XXX,XX @@
138
#include "sysemu/block-backend.h"
139
#include "block/export.h"
140
#include "block/nbd.h"
141
+#if CONFIG_LINUX
142
+#include "block/export/vhost-user-blk-server.h"
143
+#endif
144
#include "qapi/error.h"
145
#include "qapi/qapi-commands-block-export.h"
146
#include "qapi/qapi-events-block-export.h"
147
@@ -XXX,XX +XXX,XX @@
148
149
static const BlockExportDriver *blk_exp_drivers[] = {
150
&blk_exp_nbd,
151
+#if CONFIG_LINUX
152
+ &blk_exp_vhost_user_blk,
153
+#endif
154
};
155
156
/* Only accessed from the main thread */
157
diff --git a/block/export/vhost-user-blk-server.c b/block/export/vhost-user-blk-server.c
158
index XXXXXXX..XXXXXXX 100644
159
--- a/block/export/vhost-user-blk-server.c
160
+++ b/block/export/vhost-user-blk-server.c
161
@@ -XXX,XX +XXX,XX @@
162
*/
163
#include "qemu/osdep.h"
164
#include "block/block.h"
165
+#include "contrib/libvhost-user/libvhost-user.h"
166
+#include "standard-headers/linux/virtio_blk.h"
167
+#include "util/vhost-user-server.h"
168
#include "vhost-user-blk-server.h"
169
#include "qapi/error.h"
170
#include "qom/object_interfaces.h"
171
@@ -XXX,XX +XXX,XX @@ struct virtio_blk_inhdr {
172
unsigned char status;
173
};
174
175
-typedef struct VuBlockReq {
176
+typedef struct VuBlkReq {
177
VuVirtqElement elem;
178
int64_t sector_num;
179
size_t size;
180
@@ -XXX,XX +XXX,XX @@ typedef struct VuBlockReq {
181
struct virtio_blk_outhdr out;
182
VuServer *server;
183
struct VuVirtq *vq;
184
-} VuBlockReq;
185
+} VuBlkReq;
186
187
-static void vu_block_req_complete(VuBlockReq *req)
188
+/* vhost user block device */
189
+typedef struct {
190
+ BlockExport export;
191
+ VuServer vu_server;
192
+ uint32_t blk_size;
193
+ QIOChannelSocket *sioc;
194
+ struct virtio_blk_config blkcfg;
195
+ bool writable;
196
+} VuBlkExport;
197
+
198
+static void vu_blk_req_complete(VuBlkReq *req)
199
{
200
VuDev *vu_dev = &req->server->vu_dev;
201
202
@@ -XXX,XX +XXX,XX @@ static void vu_block_req_complete(VuBlockReq *req)
203
free(req);
204
}
205
206
-static VuBlockDev *get_vu_block_device_by_server(VuServer *server)
207
-{
208
- return container_of(server, VuBlockDev, vu_server);
209
-}
210
-
211
static int coroutine_fn
212
-vu_block_discard_write_zeroes(VuBlockReq *req, struct iovec *iov,
213
- uint32_t iovcnt, uint32_t type)
214
+vu_blk_discard_write_zeroes(BlockBackend *blk, struct iovec *iov,
215
+ uint32_t iovcnt, uint32_t type)
216
{
217
struct virtio_blk_discard_write_zeroes desc;
218
ssize_t size = iov_to_buf(iov, iovcnt, 0, &desc, sizeof(desc));
219
@@ -XXX,XX +XXX,XX @@ vu_block_discard_write_zeroes(VuBlockReq *req, struct iovec *iov,
220
return -EINVAL;
221
}
222
223
- VuBlockDev *vdev_blk = get_vu_block_device_by_server(req->server);
224
uint64_t range[2] = { le64_to_cpu(desc.sector) << 9,
225
le32_to_cpu(desc.num_sectors) << 9 };
226
if (type == VIRTIO_BLK_T_DISCARD) {
227
- if (blk_co_pdiscard(vdev_blk->backend, range[0], range[1]) == 0) {
228
+ if (blk_co_pdiscard(blk, range[0], range[1]) == 0) {
229
return 0;
43
}
230
}
44
231
} else if (type == VIRTIO_BLK_T_WRITE_ZEROES) {
45
- throttle_account(&backend->ts, true, ret);
232
- if (blk_co_pwrite_zeroes(vdev_blk->backend,
46
+ throttle_account(&backend->ts, THROTTLE_WRITE, ret);
233
- range[0], range[1], 0) == 0) {
47
cryptodev_backend_operation(backend, op_info);
234
+ if (blk_co_pwrite_zeroes(blk, range[0], range[1], 0) == 0) {
48
if (throttle_enabled(&backend->tc) &&
235
return 0;
49
- throttle_schedule_timer(&backend->ts, &backend->tt, true)) {
236
}
50
+ throttle_schedule_timer(&backend->ts, &backend->tt,
237
}
51
+ THROTTLE_WRITE)) {
238
@@ -XXX,XX +XXX,XX @@ vu_block_discard_write_zeroes(VuBlockReq *req, struct iovec *iov,
239
return -EINVAL;
240
}
241
242
-static int coroutine_fn vu_block_flush(VuBlockReq *req)
243
+static void coroutine_fn vu_blk_virtio_process_req(void *opaque)
244
{
245
- VuBlockDev *vdev_blk = get_vu_block_device_by_server(req->server);
246
- BlockBackend *backend = vdev_blk->backend;
247
- return blk_co_flush(backend);
248
-}
249
-
250
-static void coroutine_fn vu_block_virtio_process_req(void *opaque)
251
-{
252
- VuBlockReq *req = opaque;
253
+ VuBlkReq *req = opaque;
254
VuServer *server = req->server;
255
VuVirtqElement *elem = &req->elem;
256
uint32_t type;
257
258
- VuBlockDev *vdev_blk = get_vu_block_device_by_server(server);
259
- BlockBackend *backend = vdev_blk->backend;
260
+ VuBlkExport *vexp = container_of(server, VuBlkExport, vu_server);
261
+ BlockBackend *blk = vexp->export.blk;
262
263
struct iovec *in_iov = elem->in_sg;
264
struct iovec *out_iov = elem->out_sg;
265
@@ -XXX,XX +XXX,XX @@ static void coroutine_fn vu_block_virtio_process_req(void *opaque)
266
bool is_write = type & VIRTIO_BLK_T_OUT;
267
req->sector_num = le64_to_cpu(req->out.sector);
268
269
- int64_t offset = req->sector_num * vdev_blk->blk_size;
270
+ if (is_write && !vexp->writable) {
271
+ req->in->status = VIRTIO_BLK_S_IOERR;
272
+ break;
273
+ }
274
+
275
+ int64_t offset = req->sector_num * vexp->blk_size;
276
QEMUIOVector qiov;
277
if (is_write) {
278
qemu_iovec_init_external(&qiov, out_iov, out_num);
279
- ret = blk_co_pwritev(backend, offset, qiov.size,
280
- &qiov, 0);
281
+ ret = blk_co_pwritev(blk, offset, qiov.size, &qiov, 0);
282
} else {
283
qemu_iovec_init_external(&qiov, in_iov, in_num);
284
- ret = blk_co_preadv(backend, offset, qiov.size,
285
- &qiov, 0);
286
+ ret = blk_co_preadv(blk, offset, qiov.size, &qiov, 0);
287
}
288
if (ret >= 0) {
289
req->in->status = VIRTIO_BLK_S_OK;
290
@@ -XXX,XX +XXX,XX @@ static void coroutine_fn vu_block_virtio_process_req(void *opaque)
291
break;
292
}
293
case VIRTIO_BLK_T_FLUSH:
294
- if (vu_block_flush(req) == 0) {
295
+ if (blk_co_flush(blk) == 0) {
296
req->in->status = VIRTIO_BLK_S_OK;
297
} else {
298
req->in->status = VIRTIO_BLK_S_IOERR;
299
@@ -XXX,XX +XXX,XX @@ static void coroutine_fn vu_block_virtio_process_req(void *opaque)
300
case VIRTIO_BLK_T_DISCARD:
301
case VIRTIO_BLK_T_WRITE_ZEROES: {
302
int rc;
303
- rc = vu_block_discard_write_zeroes(req, &elem->out_sg[1],
304
- out_num, type);
305
+
306
+ if (!vexp->writable) {
307
+ req->in->status = VIRTIO_BLK_S_IOERR;
308
+ break;
309
+ }
310
+
311
+ rc = vu_blk_discard_write_zeroes(blk, &elem->out_sg[1], out_num, type);
312
if (rc == 0) {
313
req->in->status = VIRTIO_BLK_S_OK;
314
} else {
315
@@ -XXX,XX +XXX,XX @@ static void coroutine_fn vu_block_virtio_process_req(void *opaque)
316
break;
317
}
318
319
- vu_block_req_complete(req);
320
+ vu_blk_req_complete(req);
321
return;
322
323
err:
324
- free(elem);
325
+ free(req);
326
}
327
328
-static void vu_block_process_vq(VuDev *vu_dev, int idx)
329
+static void vu_blk_process_vq(VuDev *vu_dev, int idx)
330
{
331
VuServer *server = container_of(vu_dev, VuServer, vu_dev);
332
VuVirtq *vq = vu_get_queue(vu_dev, idx);
333
334
while (1) {
335
- VuBlockReq *req;
336
+ VuBlkReq *req;
337
338
- req = vu_queue_pop(vu_dev, vq, sizeof(VuBlockReq));
339
+ req = vu_queue_pop(vu_dev, vq, sizeof(VuBlkReq));
340
if (!req) {
52
break;
341
break;
53
}
342
}
343
@@ -XXX,XX +XXX,XX @@ static void vu_block_process_vq(VuDev *vu_dev, int idx)
344
req->vq = vq;
345
346
Coroutine *co =
347
- qemu_coroutine_create(vu_block_virtio_process_req, req);
348
+ qemu_coroutine_create(vu_blk_virtio_process_req, req);
349
qemu_coroutine_enter(co);
54
}
350
}
55
@@ -XXX,XX +XXX,XX @@ int cryptodev_backend_crypto_operation(
351
}
56
goto do_account;
352
353
-static void vu_block_queue_set_started(VuDev *vu_dev, int idx, bool started)
354
+static void vu_blk_queue_set_started(VuDev *vu_dev, int idx, bool started)
355
{
356
VuVirtq *vq;
357
358
assert(vu_dev);
359
360
vq = vu_get_queue(vu_dev, idx);
361
- vu_set_queue_handler(vu_dev, vq, started ? vu_block_process_vq : NULL);
362
+ vu_set_queue_handler(vu_dev, vq, started ? vu_blk_process_vq : NULL);
363
}
364
365
-static uint64_t vu_block_get_features(VuDev *dev)
366
+static uint64_t vu_blk_get_features(VuDev *dev)
367
{
368
uint64_t features;
369
VuServer *server = container_of(dev, VuServer, vu_dev);
370
- VuBlockDev *vdev_blk = get_vu_block_device_by_server(server);
371
+ VuBlkExport *vexp = container_of(server, VuBlkExport, vu_server);
372
features = 1ull << VIRTIO_BLK_F_SIZE_MAX |
373
1ull << VIRTIO_BLK_F_SEG_MAX |
374
1ull << VIRTIO_BLK_F_TOPOLOGY |
375
@@ -XXX,XX +XXX,XX @@ static uint64_t vu_block_get_features(VuDev *dev)
376
1ull << VIRTIO_RING_F_EVENT_IDX |
377
1ull << VHOST_USER_F_PROTOCOL_FEATURES;
378
379
- if (!vdev_blk->writable) {
380
+ if (!vexp->writable) {
381
features |= 1ull << VIRTIO_BLK_F_RO;
57
}
382
}
58
383
59
- if (throttle_schedule_timer(&backend->ts, &backend->tt, true) ||
384
return features;
60
+ if (throttle_schedule_timer(&backend->ts, &backend->tt, THROTTLE_WRITE) ||
385
}
61
!QTAILQ_EMPTY(&backend->opinfos)) {
386
62
QTAILQ_INSERT_TAIL(&backend->opinfos, op_info, next);
387
-static uint64_t vu_block_get_protocol_features(VuDev *dev)
63
return 0;
388
+static uint64_t vu_blk_get_protocol_features(VuDev *dev)
64
@@ -XXX,XX +XXX,XX @@ do_account:
389
{
65
return ret;
390
return 1ull << VHOST_USER_PROTOCOL_F_CONFIG |
391
1ull << VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD;
392
}
393
394
static int
395
-vu_block_get_config(VuDev *vu_dev, uint8_t *config, uint32_t len)
396
+vu_blk_get_config(VuDev *vu_dev, uint8_t *config, uint32_t len)
397
{
398
+ /* TODO blkcfg must be little-endian for VIRTIO 1.0 */
399
VuServer *server = container_of(vu_dev, VuServer, vu_dev);
400
- VuBlockDev *vdev_blk = get_vu_block_device_by_server(server);
401
- memcpy(config, &vdev_blk->blkcfg, len);
402
-
403
+ VuBlkExport *vexp = container_of(server, VuBlkExport, vu_server);
404
+ memcpy(config, &vexp->blkcfg, len);
405
return 0;
406
}
407
408
static int
409
-vu_block_set_config(VuDev *vu_dev, const uint8_t *data,
410
+vu_blk_set_config(VuDev *vu_dev, const uint8_t *data,
411
uint32_t offset, uint32_t size, uint32_t flags)
412
{
413
VuServer *server = container_of(vu_dev, VuServer, vu_dev);
414
- VuBlockDev *vdev_blk = get_vu_block_device_by_server(server);
415
+ VuBlkExport *vexp = container_of(server, VuBlkExport, vu_server);
416
uint8_t wce;
417
418
/* don't support live migration */
419
@@ -XXX,XX +XXX,XX @@ vu_block_set_config(VuDev *vu_dev, const uint8_t *data,
66
}
420
}
67
421
68
- throttle_account(&backend->ts, true, ret);
422
wce = *data;
69
+ throttle_account(&backend->ts, THROTTLE_WRITE, ret);
423
- vdev_blk->blkcfg.wce = wce;
70
424
- blk_set_enable_write_cache(vdev_blk->backend, wce);
71
return cryptodev_backend_operation(backend, op_info);
425
+ vexp->blkcfg.wce = wce;
72
}
426
+ blk_set_enable_write_cache(vexp->export.blk, wce);
73
diff --git a/block/throttle-groups.c b/block/throttle-groups.c
427
return 0;
428
}
429
430
@@ -XXX,XX +XXX,XX @@ vu_block_set_config(VuDev *vu_dev, const uint8_t *data,
431
* of vu_process_message.
432
*
433
*/
434
-static int vu_block_process_msg(VuDev *dev, VhostUserMsg *vmsg, int *do_reply)
435
+static int vu_blk_process_msg(VuDev *dev, VhostUserMsg *vmsg, int *do_reply)
436
{
437
if (vmsg->request == VHOST_USER_NONE) {
438
dev->panic(dev, "disconnect");
439
@@ -XXX,XX +XXX,XX @@ static int vu_block_process_msg(VuDev *dev, VhostUserMsg *vmsg, int *do_reply)
440
return false;
441
}
442
443
-static const VuDevIface vu_block_iface = {
444
- .get_features = vu_block_get_features,
445
- .queue_set_started = vu_block_queue_set_started,
446
- .get_protocol_features = vu_block_get_protocol_features,
447
- .get_config = vu_block_get_config,
448
- .set_config = vu_block_set_config,
449
- .process_msg = vu_block_process_msg,
450
+static const VuDevIface vu_blk_iface = {
451
+ .get_features = vu_blk_get_features,
452
+ .queue_set_started = vu_blk_queue_set_started,
453
+ .get_protocol_features = vu_blk_get_protocol_features,
454
+ .get_config = vu_blk_get_config,
455
+ .set_config = vu_blk_set_config,
456
+ .process_msg = vu_blk_process_msg,
457
};
458
459
static void blk_aio_attached(AioContext *ctx, void *opaque)
460
{
461
- VuBlockDev *vub_dev = opaque;
462
- vhost_user_server_attach_aio_context(&vub_dev->vu_server, ctx);
463
+ VuBlkExport *vexp = opaque;
464
+ vhost_user_server_attach_aio_context(&vexp->vu_server, ctx);
465
}
466
467
static void blk_aio_detach(void *opaque)
468
{
469
- VuBlockDev *vub_dev = opaque;
470
- vhost_user_server_detach_aio_context(&vub_dev->vu_server);
471
+ VuBlkExport *vexp = opaque;
472
+ vhost_user_server_detach_aio_context(&vexp->vu_server);
473
}
474
475
static void
476
-vu_block_initialize_config(BlockDriverState *bs,
477
+vu_blk_initialize_config(BlockDriverState *bs,
478
struct virtio_blk_config *config, uint32_t blk_size)
479
{
480
config->capacity = bdrv_getlength(bs) >> BDRV_SECTOR_BITS;
481
@@ -XXX,XX +XXX,XX @@ vu_block_initialize_config(BlockDriverState *bs,
482
config->max_write_zeroes_seg = 1;
483
}
484
485
-static VuBlockDev *vu_block_init(VuBlockDev *vu_block_device, Error **errp)
486
+static void vu_blk_exp_request_shutdown(BlockExport *exp)
487
{
488
+ VuBlkExport *vexp = container_of(exp, VuBlkExport, export);
489
490
- BlockBackend *blk;
491
- Error *local_error = NULL;
492
- const char *node_name = vu_block_device->node_name;
493
- bool writable = vu_block_device->writable;
494
- uint64_t perm = BLK_PERM_CONSISTENT_READ;
495
- int ret;
496
-
497
- AioContext *ctx;
498
-
499
- BlockDriverState *bs = bdrv_lookup_bs(node_name, node_name, &local_error);
500
-
501
- if (!bs) {
502
- error_propagate(errp, local_error);
503
- return NULL;
504
- }
505
-
506
- if (bdrv_is_read_only(bs)) {
507
- writable = false;
508
- }
509
-
510
- if (writable) {
511
- perm |= BLK_PERM_WRITE;
512
- }
513
-
514
- ctx = bdrv_get_aio_context(bs);
515
- aio_context_acquire(ctx);
516
- bdrv_invalidate_cache(bs, NULL);
517
- aio_context_release(ctx);
518
-
519
- /*
520
- * Don't allow resize while the vhost user server is running,
521
- * otherwise we don't care what happens with the node.
522
- */
523
- blk = blk_new(bdrv_get_aio_context(bs), perm,
524
- BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE_UNCHANGED |
525
- BLK_PERM_WRITE | BLK_PERM_GRAPH_MOD);
526
- ret = blk_insert_bs(blk, bs, errp);
527
-
528
- if (ret < 0) {
529
- goto fail;
530
- }
531
-
532
- blk_set_enable_write_cache(blk, false);
533
-
534
- blk_set_allow_aio_context_change(blk, true);
535
-
536
- vu_block_device->blkcfg.wce = 0;
537
- vu_block_device->backend = blk;
538
- if (!vu_block_device->blk_size) {
539
- vu_block_device->blk_size = BDRV_SECTOR_SIZE;
540
- }
541
- vu_block_device->blkcfg.blk_size = vu_block_device->blk_size;
542
- blk_set_guest_block_size(blk, vu_block_device->blk_size);
543
- vu_block_initialize_config(bs, &vu_block_device->blkcfg,
544
- vu_block_device->blk_size);
545
- return vu_block_device;
546
-
547
-fail:
548
- blk_unref(blk);
549
- return NULL;
550
-}
551
-
552
-static void vu_block_deinit(VuBlockDev *vu_block_device)
553
-{
554
- if (vu_block_device->backend) {
555
- blk_remove_aio_context_notifier(vu_block_device->backend, blk_aio_attached,
556
- blk_aio_detach, vu_block_device);
557
- }
558
-
559
- blk_unref(vu_block_device->backend);
560
-}
561
-
562
-static void vhost_user_blk_server_stop(VuBlockDev *vu_block_device)
563
-{
564
- vhost_user_server_stop(&vu_block_device->vu_server);
565
- vu_block_deinit(vu_block_device);
566
-}
567
-
568
-static void vhost_user_blk_server_start(VuBlockDev *vu_block_device,
569
- Error **errp)
570
-{
571
- AioContext *ctx;
572
- SocketAddress *addr = vu_block_device->addr;
573
-
574
- if (!vu_block_init(vu_block_device, errp)) {
575
- return;
576
- }
577
-
578
- ctx = bdrv_get_aio_context(blk_bs(vu_block_device->backend));
579
-
580
- if (!vhost_user_server_start(&vu_block_device->vu_server, addr, ctx,
581
- VHOST_USER_BLK_MAX_QUEUES, &vu_block_iface,
582
- errp)) {
583
- goto error;
584
- }
585
-
586
- blk_add_aio_context_notifier(vu_block_device->backend, blk_aio_attached,
587
- blk_aio_detach, vu_block_device);
588
- vu_block_device->running = true;
589
- return;
590
-
591
- error:
592
- vu_block_deinit(vu_block_device);
593
-}
594
-
595
-static bool vu_prop_modifiable(VuBlockDev *vus, Error **errp)
596
-{
597
- if (vus->running) {
598
- error_setg(errp, "The property can't be modified "
599
- "while the server is running");
600
- return false;
601
- }
602
- return true;
603
-}
604
-
605
-static void vu_set_node_name(Object *obj, const char *value, Error **errp)
606
-{
607
- VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
608
-
609
- if (!vu_prop_modifiable(vus, errp)) {
610
- return;
611
- }
612
-
613
- if (vus->node_name) {
614
- g_free(vus->node_name);
615
- }
616
-
617
- vus->node_name = g_strdup(value);
618
-}
619
-
620
-static char *vu_get_node_name(Object *obj, Error **errp)
621
-{
622
- VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
623
- return g_strdup(vus->node_name);
624
-}
625
-
626
-static void free_socket_addr(SocketAddress *addr)
627
-{
628
- g_free(addr->u.q_unix.path);
629
- g_free(addr);
630
-}
631
-
632
-static void vu_set_unix_socket(Object *obj, const char *value,
633
- Error **errp)
634
-{
635
- VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
636
-
637
- if (!vu_prop_modifiable(vus, errp)) {
638
- return;
639
- }
640
-
641
- if (vus->addr) {
642
- free_socket_addr(vus->addr);
643
- }
644
-
645
- SocketAddress *addr = g_new0(SocketAddress, 1);
646
- addr->type = SOCKET_ADDRESS_TYPE_UNIX;
647
- addr->u.q_unix.path = g_strdup(value);
648
- vus->addr = addr;
649
+ vhost_user_server_stop(&vexp->vu_server);
650
}
651
652
-static char *vu_get_unix_socket(Object *obj, Error **errp)
653
+static int vu_blk_exp_create(BlockExport *exp, BlockExportOptions *opts,
654
+ Error **errp)
655
{
656
- VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
657
- return g_strdup(vus->addr->u.q_unix.path);
658
-}
659
-
660
-static bool vu_get_block_writable(Object *obj, Error **errp)
661
-{
662
- VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
663
- return vus->writable;
664
-}
665
-
666
-static void vu_set_block_writable(Object *obj, bool value, Error **errp)
667
-{
668
- VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
669
-
670
- if (!vu_prop_modifiable(vus, errp)) {
671
- return;
672
- }
673
-
674
- vus->writable = value;
675
-}
676
-
677
-static void vu_get_blk_size(Object *obj, Visitor *v, const char *name,
678
- void *opaque, Error **errp)
679
-{
680
- VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
681
- uint32_t value = vus->blk_size;
682
-
683
- visit_type_uint32(v, name, &value, errp);
684
-}
685
-
686
-static void vu_set_blk_size(Object *obj, Visitor *v, const char *name,
687
- void *opaque, Error **errp)
688
-{
689
- VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
690
-
691
+ VuBlkExport *vexp = container_of(exp, VuBlkExport, export);
692
+ BlockExportOptionsVhostUserBlk *vu_opts = &opts->u.vhost_user_blk;
693
Error *local_err = NULL;
694
- uint32_t value;
695
+ uint64_t logical_block_size;
696
697
- if (!vu_prop_modifiable(vus, errp)) {
698
- return;
699
- }
700
+ vexp->writable = opts->writable;
701
+ vexp->blkcfg.wce = 0;
702
703
- visit_type_uint32(v, name, &value, &local_err);
704
- if (local_err) {
705
- goto out;
706
+ if (vu_opts->has_logical_block_size) {
707
+ logical_block_size = vu_opts->logical_block_size;
708
+ } else {
709
+ logical_block_size = BDRV_SECTOR_SIZE;
710
}
711
-
712
- check_block_size(object_get_typename(obj), name, value, &local_err);
713
+ check_block_size(exp->id, "logical-block-size", logical_block_size,
714
+ &local_err);
715
if (local_err) {
716
- goto out;
717
+ error_propagate(errp, local_err);
718
+ return -EINVAL;
719
+ }
720
+ vexp->blk_size = logical_block_size;
721
+ blk_set_guest_block_size(exp->blk, logical_block_size);
722
+ vu_blk_initialize_config(blk_bs(exp->blk), &vexp->blkcfg,
723
+ logical_block_size);
724
+
725
+ blk_set_allow_aio_context_change(exp->blk, true);
726
+ blk_add_aio_context_notifier(exp->blk, blk_aio_attached, blk_aio_detach,
727
+ vexp);
728
+
729
+ if (!vhost_user_server_start(&vexp->vu_server, vu_opts->addr, exp->ctx,
730
+ VHOST_USER_BLK_MAX_QUEUES, &vu_blk_iface,
731
+ errp)) {
732
+ blk_remove_aio_context_notifier(exp->blk, blk_aio_attached,
733
+ blk_aio_detach, vexp);
734
+ return -EADDRNOTAVAIL;
735
}
736
737
- vus->blk_size = value;
738
-
739
-out:
740
- error_propagate(errp, local_err);
741
-}
742
-
743
-static void vhost_user_blk_server_instance_finalize(Object *obj)
744
-{
745
- VuBlockDev *vub = VHOST_USER_BLK_SERVER(obj);
746
-
747
- vhost_user_blk_server_stop(vub);
748
-
749
- /*
750
- * Unlike object_property_add_str, object_class_property_add_str
751
- * doesn't have a release method. Thus manual memory freeing is
752
- * needed.
753
- */
754
- free_socket_addr(vub->addr);
755
- g_free(vub->node_name);
756
-}
757
-
758
-static void vhost_user_blk_server_complete(UserCreatable *obj, Error **errp)
759
-{
760
- VuBlockDev *vub = VHOST_USER_BLK_SERVER(obj);
761
-
762
- vhost_user_blk_server_start(vub, errp);
763
+ return 0;
764
}
765
766
-static void vhost_user_blk_server_class_init(ObjectClass *klass,
767
- void *class_data)
768
+static void vu_blk_exp_delete(BlockExport *exp)
769
{
770
- UserCreatableClass *ucc = USER_CREATABLE_CLASS(klass);
771
- ucc->complete = vhost_user_blk_server_complete;
772
-
773
- object_class_property_add_bool(klass, "writable",
774
- vu_get_block_writable,
775
- vu_set_block_writable);
776
-
777
- object_class_property_add_str(klass, "node-name",
778
- vu_get_node_name,
779
- vu_set_node_name);
780
-
781
- object_class_property_add_str(klass, "unix-socket",
782
- vu_get_unix_socket,
783
- vu_set_unix_socket);
784
+ VuBlkExport *vexp = container_of(exp, VuBlkExport, export);
785
786
- object_class_property_add(klass, "logical-block-size", "uint32",
787
- vu_get_blk_size, vu_set_blk_size,
788
- NULL, NULL);
789
+ blk_remove_aio_context_notifier(exp->blk, blk_aio_attached, blk_aio_detach,
790
+ vexp);
791
}
792
793
-static const TypeInfo vhost_user_blk_server_info = {
794
- .name = TYPE_VHOST_USER_BLK_SERVER,
795
- .parent = TYPE_OBJECT,
796
- .instance_size = sizeof(VuBlockDev),
797
- .instance_finalize = vhost_user_blk_server_instance_finalize,
798
- .class_init = vhost_user_blk_server_class_init,
799
- .interfaces = (InterfaceInfo[]) {
800
- {TYPE_USER_CREATABLE},
801
- {}
802
- },
803
+const BlockExportDriver blk_exp_vhost_user_blk = {
804
+ .type = BLOCK_EXPORT_TYPE_VHOST_USER_BLK,
805
+ .instance_size = sizeof(VuBlkExport),
806
+ .create = vu_blk_exp_create,
807
+ .delete = vu_blk_exp_delete,
808
+ .request_shutdown = vu_blk_exp_request_shutdown,
809
};
810
-
811
-static void vhost_user_blk_server_register_types(void)
812
-{
813
- type_register_static(&vhost_user_blk_server_info);
814
-}
815
-
816
-type_init(vhost_user_blk_server_register_types)
817
diff --git a/util/vhost-user-server.c b/util/vhost-user-server.c
74
index XXXXXXX..XXXXXXX 100644
818
index XXXXXXX..XXXXXXX 100644
75
--- a/block/throttle-groups.c
819
--- a/util/vhost-user-server.c
76
+++ b/block/throttle-groups.c
820
+++ b/util/vhost-user-server.c
77
@@ -XXX,XX +XXX,XX @@ static bool throttle_group_schedule_timer(ThrottleGroupMember *tgm,
821
@@ -XXX,XX +XXX,XX @@ bool vhost_user_server_start(VuServer *server,
78
ThrottleState *ts = tgm->throttle_state;
822
Error **errp)
79
ThrottleGroup *tg = container_of(ts, ThrottleGroup, ts);
823
{
80
ThrottleTimers *tt = &tgm->throttle_timers;
824
QEMUBH *bh;
81
+ ThrottleDirection direction = is_write ? THROTTLE_WRITE : THROTTLE_READ;
825
- QIONetListener *listener = qio_net_listener_new();
82
bool must_wait;
826
+ QIONetListener *listener;
83
827
+
84
if (qatomic_read(&tgm->io_limits_disabled)) {
828
+ if (socket_addr->type != SOCKET_ADDRESS_TYPE_UNIX &&
85
@@ -XXX,XX +XXX,XX @@ static bool throttle_group_schedule_timer(ThrottleGroupMember *tgm,
829
+ socket_addr->type != SOCKET_ADDRESS_TYPE_FD) {
86
return true;
830
+ error_setg(errp, "Only socket address types 'unix' and 'fd' are supported");
87
}
831
+ return false;
88
832
+ }
89
- must_wait = throttle_schedule_timer(ts, tt, is_write);
833
+
90
+ must_wait = throttle_schedule_timer(ts, tt, direction);
834
+ listener = qio_net_listener_new();
91
835
if (qio_net_listener_open_sync(listener, socket_addr, 1,
92
/* If a timer just got armed, set tgm as the current token */
836
errp) < 0) {
93
if (must_wait) {
837
object_unref(OBJECT(listener));
94
@@ -XXX,XX +XXX,XX @@ void coroutine_fn throttle_group_co_io_limits_intercept(ThrottleGroupMember *tgm
838
diff --git a/block/export/meson.build b/block/export/meson.build
95
bool must_wait;
96
ThrottleGroupMember *token;
97
ThrottleGroup *tg = container_of(tgm->throttle_state, ThrottleGroup, ts);
98
+ ThrottleDirection direction = is_write ? THROTTLE_WRITE : THROTTLE_READ;
99
100
assert(bytes >= 0);
101
102
@@ -XXX,XX +XXX,XX @@ void coroutine_fn throttle_group_co_io_limits_intercept(ThrottleGroupMember *tgm
103
}
104
105
/* The I/O will be executed, so do the accounting */
106
- throttle_account(tgm->throttle_state, is_write, bytes);
107
+ throttle_account(tgm->throttle_state, direction, bytes);
108
109
/* Schedule the next request */
110
schedule_next_request(tgm, is_write);
111
diff --git a/fsdev/qemu-fsdev-throttle.c b/fsdev/qemu-fsdev-throttle.c
112
index XXXXXXX..XXXXXXX 100644
839
index XXXXXXX..XXXXXXX 100644
113
--- a/fsdev/qemu-fsdev-throttle.c
840
--- a/block/export/meson.build
114
+++ b/fsdev/qemu-fsdev-throttle.c
841
+++ b/block/export/meson.build
115
@@ -XXX,XX +XXX,XX @@ void fsdev_throttle_init(FsThrottle *fst)
842
@@ -1 +1,2 @@
116
void coroutine_fn fsdev_co_throttle_request(FsThrottle *fst, bool is_write,
843
block_ss.add(files('export.c'))
117
struct iovec *iov, int iovcnt)
844
+block_ss.add(when: 'CONFIG_LINUX', if_true: files('vhost-user-blk-server.c', '../../contrib/libvhost-user/libvhost-user.c'))
118
{
845
diff --git a/block/meson.build b/block/meson.build
119
+ ThrottleDirection direction = is_write ? THROTTLE_WRITE : THROTTLE_READ;
120
+
121
if (throttle_enabled(&fst->cfg)) {
122
- if (throttle_schedule_timer(&fst->ts, &fst->tt, is_write) ||
123
+ if (throttle_schedule_timer(&fst->ts, &fst->tt, direction) ||
124
!qemu_co_queue_empty(&fst->throttled_reqs[is_write])) {
125
qemu_co_queue_wait(&fst->throttled_reqs[is_write], NULL);
126
}
127
128
- throttle_account(&fst->ts, is_write, iov_size(iov, iovcnt));
129
+ throttle_account(&fst->ts, direction, iov_size(iov, iovcnt));
130
131
if (!qemu_co_queue_empty(&fst->throttled_reqs[is_write]) &&
132
- !throttle_schedule_timer(&fst->ts, &fst->tt, is_write)) {
133
+ !throttle_schedule_timer(&fst->ts, &fst->tt, direction)) {
134
qemu_co_queue_next(&fst->throttled_reqs[is_write]);
135
}
136
}
137
diff --git a/tests/unit/test-throttle.c b/tests/unit/test-throttle.c
138
index XXXXXXX..XXXXXXX 100644
846
index XXXXXXX..XXXXXXX 100644
139
--- a/tests/unit/test-throttle.c
847
--- a/block/meson.build
140
+++ b/tests/unit/test-throttle.c
848
+++ b/block/meson.build
141
@@ -XXX,XX +XXX,XX @@ static bool do_test_accounting(bool is_ops, /* are we testing bps or ops */
849
@@ -XXX,XX +XXX,XX @@ block_ss.add(when: 'CONFIG_WIN32', if_true: files('file-win32.c', 'win32-aio.c')
142
throttle_config(&ts, QEMU_CLOCK_VIRTUAL, &cfg);
850
block_ss.add(when: 'CONFIG_POSIX', if_true: [files('file-posix.c'), coref, iokit])
143
851
block_ss.add(when: 'CONFIG_LIBISCSI', if_true: files('iscsi-opts.c'))
144
/* account a read */
852
block_ss.add(when: 'CONFIG_LINUX', if_true: files('nvme.c'))
145
- throttle_account(&ts, false, size);
853
-block_ss.add(when: 'CONFIG_LINUX', if_true: files('export/vhost-user-blk-server.c', '../contrib/libvhost-user/libvhost-user.c'))
146
+ throttle_account(&ts, THROTTLE_READ, size);
854
block_ss.add(when: 'CONFIG_REPLICATION', if_true: files('replication.c'))
147
/* account a write */
855
block_ss.add(when: 'CONFIG_SHEEPDOG', if_true: files('sheepdog.c'))
148
- throttle_account(&ts, true, size);
856
block_ss.add(when: ['CONFIG_LINUX_AIO', libaio], if_true: files('linux-aio.c'))
149
+ throttle_account(&ts, THROTTLE_WRITE, size);
150
151
/* check total result */
152
index = to_test[is_ops][0];
153
diff --git a/util/throttle.c b/util/throttle.c
154
index XXXXXXX..XXXXXXX 100644
155
--- a/util/throttle.c
156
+++ b/util/throttle.c
157
@@ -XXX,XX +XXX,XX @@ int64_t throttle_compute_wait(LeakyBucket *bkt)
158
159
/* This function compute the time that must be waited while this IO
160
*
161
- * @is_write: true if the current IO is a write, false if it's a read
162
+ * @direction: throttle direction
163
* @ret: time to wait
164
*/
165
static int64_t throttle_compute_wait_for(ThrottleState *ts,
166
- bool is_write)
167
+ ThrottleDirection direction)
168
{
169
BucketType to_check[2][4] = { {THROTTLE_BPS_TOTAL,
170
THROTTLE_OPS_TOTAL,
171
@@ -XXX,XX +XXX,XX @@ static int64_t throttle_compute_wait_for(ThrottleState *ts,
172
int i;
173
174
for (i = 0; i < 4; i++) {
175
- BucketType index = to_check[is_write][i];
176
+ BucketType index = to_check[direction][i];
177
wait = throttle_compute_wait(&ts->cfg.buckets[index]);
178
if (wait > max_wait) {
179
max_wait = wait;
180
@@ -XXX,XX +XXX,XX @@ static int64_t throttle_compute_wait_for(ThrottleState *ts,
181
182
/* compute the timer for this type of operation
183
*
184
- * @is_write: the type of operation
185
+ * @direction: throttle direction
186
* @now: the current clock timestamp
187
* @next_timestamp: the resulting timer
188
* @ret: true if a timer must be set
189
*/
190
static bool throttle_compute_timer(ThrottleState *ts,
191
- bool is_write,
192
+ ThrottleDirection direction,
193
int64_t now,
194
int64_t *next_timestamp)
195
{
196
@@ -XXX,XX +XXX,XX @@ static bool throttle_compute_timer(ThrottleState *ts,
197
throttle_do_leak(ts, now);
198
199
/* compute the wait time if any */
200
- wait = throttle_compute_wait_for(ts, is_write);
201
+ wait = throttle_compute_wait_for(ts, direction);
202
203
/* if the code must wait compute when the next timer should fire */
204
if (wait) {
205
@@ -XXX,XX +XXX,XX @@ void throttle_get_config(ThrottleState *ts, ThrottleConfig *cfg)
206
* NOTE: this function is not unit tested due to it's usage of timer_mod
207
*
208
* @tt: the timers structure
209
- * @is_write: the type of operation (read/write)
210
+ * @direction: throttle direction
211
* @ret: true if the timer has been scheduled else false
212
*/
213
bool throttle_schedule_timer(ThrottleState *ts,
214
ThrottleTimers *tt,
215
- bool is_write)
216
+ ThrottleDirection direction)
217
{
218
int64_t now = qemu_clock_get_ns(tt->clock_type);
219
int64_t next_timestamp;
220
QEMUTimer *timer;
221
bool must_wait;
222
223
- timer = is_write ? tt->timers[THROTTLE_WRITE] : tt->timers[THROTTLE_READ];
224
+ assert(direction < THROTTLE_MAX);
225
+ timer = tt->timers[direction];
226
assert(timer);
227
228
must_wait = throttle_compute_timer(ts,
229
- is_write,
230
+ direction,
231
now,
232
&next_timestamp);
233
234
@@ -XXX,XX +XXX,XX @@ bool throttle_schedule_timer(ThrottleState *ts,
235
236
/* do the accounting for this operation
237
*
238
- * @is_write: the type of operation (read/write)
239
+ * @direction: throttle direction
240
* @size: the size of the operation
241
*/
242
-void throttle_account(ThrottleState *ts, bool is_write, uint64_t size)
243
+void throttle_account(ThrottleState *ts, ThrottleDirection direction,
244
+ uint64_t size)
245
{
246
const BucketType bucket_types_size[2][2] = {
247
{ THROTTLE_BPS_TOTAL, THROTTLE_BPS_READ },
248
@@ -XXX,XX +XXX,XX @@ void throttle_account(ThrottleState *ts, bool is_write, uint64_t size)
249
double units = 1.0;
250
unsigned i;
251
252
+ assert(direction < THROTTLE_MAX);
253
/* if cfg.op_size is defined and smaller than size we compute unit count */
254
if (ts->cfg.op_size && size > ts->cfg.op_size) {
255
units = (double) size / ts->cfg.op_size;
256
@@ -XXX,XX +XXX,XX @@ void throttle_account(ThrottleState *ts, bool is_write, uint64_t size)
257
for (i = 0; i < 2; i++) {
258
LeakyBucket *bkt;
259
260
- bkt = &ts->cfg.buckets[bucket_types_size[is_write][i]];
261
+ bkt = &ts->cfg.buckets[bucket_types_size[direction][i]];
262
bkt->level += size;
263
if (bkt->burst_length > 1) {
264
bkt->burst_level += size;
265
}
266
267
- bkt = &ts->cfg.buckets[bucket_types_units[is_write][i]];
268
+ bkt = &ts->cfg.buckets[bucket_types_units[direction][i]];
269
bkt->level += units;
270
if (bkt->burst_length > 1) {
271
bkt->burst_level += units;
272
--
857
--
273
2.41.0
858
2.26.2
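The throttle hunks above replace the `bool is_write` parameter with a `ThrottleDirection` value that indexes the per-direction arrays directly. The following is a minimal sketch of that pattern, not the QEMU code: `DemoTimers`, `demo_direction_from_is_write()` and `demo_pick_timer()` are hypothetical names, the timers are plain integers, and the enum layout (read first, `THROTTLE_MAX` last) is an assumption inferred from how the patch indexes `to_check[direction]`.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed layout: only the enumerator names appear in the patch. */
typedef enum {
    THROTTLE_READ = 0,
    THROTTLE_WRITE,
    THROTTLE_MAX
} ThrottleDirection;

/* Hypothetical stand-in for ThrottleTimers; QEMU stores QEMUTimer pointers. */
typedef struct {
    int timers[THROTTLE_MAX];
} DemoTimers;

/* Callers that still carry a bool convert once, at the boundary. */
static ThrottleDirection demo_direction_from_is_write(bool is_write)
{
    return is_write ? THROTTLE_WRITE : THROTTLE_READ;
}

/* The direction indexes the array directly, guarded by a bounds assert. */
static int demo_pick_timer(const DemoTimers *tt, ThrottleDirection direction)
{
    assert(direction < THROTTLE_MAX);
    return tt->timers[direction];
}
```

Converting at the call site, as `throttle_group_co_io_limits_intercept()` and `fsdev_co_throttle_request()` do in the hunks above, keeps the bool out of the throttle core while leaving existing callers unchanged.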
New patch
1
Headers used by other subsystems are located in include/. Also add the
2
vhost-user-server and vhost-user-blk-server headers to MAINTAINERS.
1
3
4
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
5
Message-id: 20200924151549.913737-13-stefanha@redhat.com
6
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
7
---
8
MAINTAINERS | 4 +++-
9
{util => include/qemu}/vhost-user-server.h | 0
10
block/export/vhost-user-blk-server.c | 2 +-
11
util/vhost-user-server.c | 2 +-
12
4 files changed, 5 insertions(+), 3 deletions(-)
13
rename {util => include/qemu}/vhost-user-server.h (100%)
14
15
diff --git a/MAINTAINERS b/MAINTAINERS
16
index XXXXXXX..XXXXXXX 100644
17
--- a/MAINTAINERS
18
+++ b/MAINTAINERS
19
@@ -XXX,XX +XXX,XX @@ Vhost-user block device backend server
20
M: Coiby Xu <Coiby.Xu@gmail.com>
21
S: Maintained
22
F: block/export/vhost-user-blk-server.c
23
-F: util/vhost-user-server.c
24
+F: block/export/vhost-user-blk-server.h
25
+F: include/qemu/vhost-user-server.h
26
F: tests/qtest/libqos/vhost-user-blk.c
27
+F: util/vhost-user-server.c
28
29
Replication
30
M: Wen Congyang <wencongyang2@huawei.com>
31
diff --git a/util/vhost-user-server.h b/include/qemu/vhost-user-server.h
32
similarity index 100%
33
rename from util/vhost-user-server.h
34
rename to include/qemu/vhost-user-server.h
35
diff --git a/block/export/vhost-user-blk-server.c b/block/export/vhost-user-blk-server.c
36
index XXXXXXX..XXXXXXX 100644
37
--- a/block/export/vhost-user-blk-server.c
38
+++ b/block/export/vhost-user-blk-server.c
39
@@ -XXX,XX +XXX,XX @@
40
#include "block/block.h"
41
#include "contrib/libvhost-user/libvhost-user.h"
42
#include "standard-headers/linux/virtio_blk.h"
43
-#include "util/vhost-user-server.h"
44
+#include "qemu/vhost-user-server.h"
45
#include "vhost-user-blk-server.h"
46
#include "qapi/error.h"
47
#include "qom/object_interfaces.h"
48
diff --git a/util/vhost-user-server.c b/util/vhost-user-server.c
49
index XXXXXXX..XXXXXXX 100644
50
--- a/util/vhost-user-server.c
51
+++ b/util/vhost-user-server.c
52
@@ -XXX,XX +XXX,XX @@
53
*/
54
#include "qemu/osdep.h"
55
#include "qemu/main-loop.h"
56
+#include "qemu/vhost-user-server.h"
57
#include "block/aio-wait.h"
58
-#include "vhost-user-server.h"
59
60
/*
61
* Theory of operation:
62
--
63
2.26.2
New patch
1
Don't compile contrib/libvhost-user/libvhost-user.c again. Instead build
2
the static library once and then reuse it throughout QEMU.
1
3
4
Also switch from CONFIG_LINUX to CONFIG_VHOST_USER, which is what the
5
vhost-user tools (vhost-user-gpu, etc) do.
6
7
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
8
Message-id: 20200924151549.913737-14-stefanha@redhat.com
9
[Added CONFIG_LINUX again because libvhost-user doesn't build on macOS.
10
--Stefan]
11
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
12
---
13
block/export/export.c | 8 ++++----
14
block/export/meson.build | 2 +-
15
contrib/libvhost-user/meson.build | 1 +
16
meson.build | 6 +++++-
17
util/meson.build | 4 +++-
18
5 files changed, 14 insertions(+), 7 deletions(-)
19
20
diff --git a/block/export/export.c b/block/export/export.c
21
index XXXXXXX..XXXXXXX 100644
22
--- a/block/export/export.c
23
+++ b/block/export/export.c
24
@@ -XXX,XX +XXX,XX @@
25
#include "sysemu/block-backend.h"
26
#include "block/export.h"
27
#include "block/nbd.h"
28
-#if CONFIG_LINUX
29
-#include "block/export/vhost-user-blk-server.h"
30
-#endif
31
#include "qapi/error.h"
32
#include "qapi/qapi-commands-block-export.h"
33
#include "qapi/qapi-events-block-export.h"
34
#include "qemu/id.h"
35
+#ifdef CONFIG_VHOST_USER
36
+#include "vhost-user-blk-server.h"
37
+#endif
38
39
static const BlockExportDriver *blk_exp_drivers[] = {
40
&blk_exp_nbd,
41
-#if CONFIG_LINUX
42
+#ifdef CONFIG_VHOST_USER
43
&blk_exp_vhost_user_blk,
44
#endif
45
};
46
diff --git a/block/export/meson.build b/block/export/meson.build
47
index XXXXXXX..XXXXXXX 100644
48
--- a/block/export/meson.build
49
+++ b/block/export/meson.build
50
@@ -XXX,XX +XXX,XX @@
51
block_ss.add(files('export.c'))
52
-block_ss.add(when: 'CONFIG_LINUX', if_true: files('vhost-user-blk-server.c', '../../contrib/libvhost-user/libvhost-user.c'))
53
+block_ss.add(when: ['CONFIG_LINUX', 'CONFIG_VHOST_USER'], if_true: files('vhost-user-blk-server.c'))
54
diff --git a/contrib/libvhost-user/meson.build b/contrib/libvhost-user/meson.build
55
index XXXXXXX..XXXXXXX 100644
56
--- a/contrib/libvhost-user/meson.build
57
+++ b/contrib/libvhost-user/meson.build
58
@@ -XXX,XX +XXX,XX @@
59
libvhost_user = static_library('vhost-user',
60
files('libvhost-user.c', 'libvhost-user-glib.c'),
61
build_by_default: false)
62
+vhost_user = declare_dependency(link_with: libvhost_user)
63
diff --git a/meson.build b/meson.build
64
index XXXXXXX..XXXXXXX 100644
65
--- a/meson.build
66
+++ b/meson.build
67
@@ -XXX,XX +XXX,XX @@ trace_events_subdirs += [
68
'util',
69
]
70
71
+vhost_user = not_found
72
+if 'CONFIG_VHOST_USER' in config_host
73
+ subdir('contrib/libvhost-user')
74
+endif
75
+
76
subdir('qapi')
77
subdir('qobject')
78
subdir('stubs')
79
@@ -XXX,XX +XXX,XX @@ if have_tools
80
install: true)
81
82
if 'CONFIG_VHOST_USER' in config_host
83
- subdir('contrib/libvhost-user')
84
subdir('contrib/vhost-user-blk')
85
subdir('contrib/vhost-user-gpu')
86
subdir('contrib/vhost-user-input')
87
diff --git a/util/meson.build b/util/meson.build
88
index XXXXXXX..XXXXXXX 100644
89
--- a/util/meson.build
90
+++ b/util/meson.build
91
@@ -XXX,XX +XXX,XX @@ if have_block
92
util_ss.add(files('main-loop.c'))
93
util_ss.add(files('nvdimm-utils.c'))
94
util_ss.add(files('qemu-coroutine.c', 'qemu-coroutine-lock.c', 'qemu-coroutine-io.c'))
95
- util_ss.add(when: 'CONFIG_LINUX', if_true: files('vhost-user-server.c'))
96
+ util_ss.add(when: ['CONFIG_LINUX', 'CONFIG_VHOST_USER'], if_true: [
97
+ files('vhost-user-server.c'), vhost_user
98
+ ])
99
util_ss.add(files('block-helpers.c'))
100
util_ss.add(files('qemu-coroutine-sleep.c'))
101
util_ss.add(files('qemu-co-shared-resource.c'))
102
--
103
2.26.2
1
We duplicate the same condition three times here, so pull it out to the top
1
Introduce libblockdev.fa to avoid recompiling blockdev_ss twice.
2
level.
3
2
4
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
3
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
5
Message-Id: <20230824155345.109765-5-hreitz@redhat.com>
4
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
6
Reviewed-by: Sam Li <faithilikerun@gmail.com>
5
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
6
Message-id: 20200929125516.186715-3-stefanha@redhat.com
7
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
7
---
8
---
8
block/file-posix.c | 18 +++++-------------
9
meson.build | 12 ++++++++++--
9
1 file changed, 5 insertions(+), 13 deletions(-)
10
storage-daemon/meson.build | 3 +--
11
2 files changed, 11 insertions(+), 4 deletions(-)
10
12
11
diff --git a/block/file-posix.c b/block/file-posix.c
13
diff --git a/meson.build b/meson.build
12
index XXXXXXX..XXXXXXX 100644
14
index XXXXXXX..XXXXXXX 100644
13
--- a/block/file-posix.c
15
--- a/meson.build
14
+++ b/block/file-posix.c
16
+++ b/meson.build
15
@@ -XXX,XX +XXX,XX @@ static int coroutine_fn raw_co_prw(BlockDriverState *bs, uint64_t offset,
17
@@ -XXX,XX +XXX,XX @@ blockdev_ss.add(files(
16
18
# os-win32.c does not
17
out:
19
blockdev_ss.add(when: 'CONFIG_POSIX', if_true: files('os-posix.c'))
18
#if defined(CONFIG_BLKZONED)
20
softmmu_ss.add(when: 'CONFIG_WIN32', if_true: [files('os-win32.c')])
19
-{
21
-softmmu_ss.add_all(blockdev_ss)
20
- BlockZoneWps *wps = bs->wps;
22
21
- if (ret == 0) {
23
common_ss.add(files('cpus-common.c'))
22
- if ((type & (QEMU_AIO_WRITE | QEMU_AIO_ZONE_APPEND)) &&
24
23
- bs->bl.zoned != BLK_Z_NONE) {
25
@@ -XXX,XX +XXX,XX @@ block = declare_dependency(link_whole: [libblock],
24
+ if ((type & (QEMU_AIO_WRITE | QEMU_AIO_ZONE_APPEND)) &&
26
link_args: '@block.syms',
25
+ bs->bl.zoned != BLK_Z_NONE) {
27
dependencies: [crypto, io])
26
+ BlockZoneWps *wps = bs->wps;
28
27
+ if (ret == 0) {
29
+blockdev_ss = blockdev_ss.apply(config_host, strict: false)
28
uint64_t *wp = &wps->wp[offset / bs->bl.zone_size];
30
+libblockdev = static_library('blockdev', blockdev_ss.sources() + genh,
29
if (!BDRV_ZT_IS_CONV(*wp)) {
31
+ dependencies: blockdev_ss.dependencies(),
30
if (type & QEMU_AIO_ZONE_APPEND) {
32
+ name_suffix: 'fa',
31
@@ -XXX,XX +XXX,XX @@ out:
33
+ build_by_default: false)
32
*wp = offset + bytes;
34
+
33
}
35
+blockdev = declare_dependency(link_whole: [libblockdev],
34
}
36
+ dependencies: [block])
35
- }
37
+
36
- } else {
38
qmp_ss = qmp_ss.apply(config_host, strict: false)
37
- if ((type & (QEMU_AIO_WRITE | QEMU_AIO_ZONE_APPEND)) &&
39
libqmp = static_library('qmp', qmp_ss.sources() + genh,
38
- bs->bl.zoned != BLK_Z_NONE) {
40
dependencies: qmp_ss.dependencies(),
39
+ } else {
41
@@ -XXX,XX +XXX,XX @@ foreach m : block_mods + softmmu_mods
40
update_zones_wp(bs, s->fd, 0, 1);
42
install_dir: config_host['qemu_moddir'])
41
}
43
endforeach
42
- }
44
43
45
-softmmu_ss.add(authz, block, chardev, crypto, io, qmp)
44
- if ((type & (QEMU_AIO_WRITE | QEMU_AIO_ZONE_APPEND)) &&
46
+softmmu_ss.add(authz, blockdev, chardev, crypto, io, qmp)
45
- bs->bl.zoned != BLK_Z_NONE) {
47
common_ss.add(qom, qemuutil)
46
qemu_co_mutex_unlock(&wps->colock);
48
47
}
49
common_ss.add_all(when: 'CONFIG_SOFTMMU', if_true: [softmmu_ss])
48
-}
50
diff --git a/storage-daemon/meson.build b/storage-daemon/meson.build
49
#endif
51
index XXXXXXX..XXXXXXX 100644
50
return ret;
52
--- a/storage-daemon/meson.build
51
}
53
+++ b/storage-daemon/meson.build
54
@@ -XXX,XX +XXX,XX @@
55
qsd_ss = ss.source_set()
56
qsd_ss.add(files('qemu-storage-daemon.c'))
57
-qsd_ss.add(block, chardev, qmp, qom, qemuutil)
58
-qsd_ss.add_all(blockdev_ss)
59
+qsd_ss.add(blockdev, chardev, qmp, qom, qemuutil)
60
61
subdir('qapi')
62
52
--
63
--
53
2.41.0
64
2.26.2
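The file-posix hunk above hoists the repeated "zoned write" condition to the top level of the `out:` path. A small sketch of why the two shapes are equivalent, assuming nothing beyond the control flow visible in the hunk: the `ACT_*` flags and both `demo_out_path_*()` functions are hypothetical and only record which step would run (write pointer advance, write pointer refresh via `update_zones_wp()`, mutex unlock).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flags recording which step of the 'out' path would run. */
enum {
    ACT_ADVANCE_WP = 1u << 0,   /* success: move the zone write pointer */
    ACT_REFRESH_WP = 1u << 1,   /* error: re-read write pointers */
    ACT_UNLOCK     = 1u << 2    /* release wps->colock */
};

/* Shape before the patch: the same condition is tested three times. */
static unsigned demo_out_path_before(bool zoned_write, int ret)
{
    unsigned actions = 0;

    if (ret == 0) {
        if (zoned_write) {
            actions |= ACT_ADVANCE_WP;
        }
    } else {
        if (zoned_write) {
            actions |= ACT_REFRESH_WP;
        }
    }
    if (zoned_write) {
        actions |= ACT_UNLOCK;
    }
    return actions;
}

/* Shape after the patch: the condition is tested once, at the top level. */
static unsigned demo_out_path_after(bool zoned_write, int ret)
{
    unsigned actions = 0;

    if (zoned_write) {
        actions |= (ret == 0) ? ACT_ADVANCE_WP : ACT_REFRESH_WP;
        actions |= ACT_UNLOCK;
    }
    return actions;
}
```

Both functions yield the same set of actions for every combination of inputs, so the refactoring changes structure only; it also keeps the `wps` pointer scoped to the branch that uses it.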
1
This is a regression test for
1
Block exports are used by softmmu, qemu-storage-daemon, and qemu-nbd.
2
https://bugzilla.redhat.com/show_bug.cgi?id=2234374.
2
They are not used by other programs and are not otherwise needed in
3
libblock.
3
4
4
All this test needs to do is trigger an I/O error inside of file-posix
5
Undo the recent move of blockdev-nbd.c from blockdev_ss into block_ss.
5
(specifically raw_co_prw()). One reliable way to do this without
6
Since bdrv_close_all() (libblock) calls blk_exp_close_all()
6
requiring special privileges is to use a FUSE export, which allows us to
7
(libblockdev), a stub function is required.
7
inject any error that we want, e.g. via blkdebug.
8
8
9
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
9
Make qemu-nbd.c use signal handling utility functions instead of
10
Message-Id: <20230824155345.109765-6-hreitz@redhat.com>
10
duplicating the code. This helps because os-posix.c is in libblockdev
11
[hreitz: Fixed test to be skipped when there is no FUSE support, to
11
and it depends on a qemu_system_killed() symbol that qemu-nbd.c lacks.
12
suppress fusermount's allow_other warning, and to be skipped
12
Once we use the signal handling utility functions we also end up
13
with $IMGOPTSSYNTAX enabled]
13
providing the necessary symbol.
14
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
14
15
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
16
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
17
Reviewed-by: Eric Blake <eblake@redhat.com>
18
Message-id: 20200929125516.186715-4-stefanha@redhat.com
19
[Fixed s/ndb/nbd/ typo in commit description as suggested by Eric Blake
20
--Stefan]
21
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
15
---
22
---
16
tests/qemu-iotests/tests/file-io-error | 119 +++++++++++++++++++++
23
qemu-nbd.c | 21 ++++++++-------------
17
tests/qemu-iotests/tests/file-io-error.out | 33 ++++++
24
stubs/blk-exp-close-all.c | 7 +++++++
18
2 files changed, 152 insertions(+)
25
block/export/meson.build | 4 ++--
19
create mode 100755 tests/qemu-iotests/tests/file-io-error
26
meson.build | 4 ++--
20
create mode 100644 tests/qemu-iotests/tests/file-io-error.out
27
nbd/meson.build | 2 ++
28
stubs/meson.build | 1 +
29
6 files changed, 22 insertions(+), 17 deletions(-)
30
create mode 100644 stubs/blk-exp-close-all.c
21
31
22
diff --git a/tests/qemu-iotests/tests/file-io-error b/tests/qemu-iotests/tests/file-io-error
32
diff --git a/qemu-nbd.c b/qemu-nbd.c
23
new file mode 100755
33
index XXXXXXX..XXXXXXX 100644
24
index XXXXXXX..XXXXXXX
34
--- a/qemu-nbd.c
25
--- /dev/null
35
+++ b/qemu-nbd.c
26
+++ b/tests/qemu-iotests/tests/file-io-error
27
@@ -XXX,XX +XXX,XX @@
36
@@ -XXX,XX +XXX,XX @@
28
+#!/usr/bin/env bash
37
#include "qapi/error.h"
29
+# group: rw
38
#include "qemu/cutils.h"
30
+#
39
#include "sysemu/block-backend.h"
31
+# Produce an I/O error in file-posix, and hope that it is not catastrophic.
40
+#include "sysemu/runstate.h" /* for qemu_system_killed() prototype */
32
+# Regression test for: https://bugzilla.redhat.com/show_bug.cgi?id=2234374
41
#include "block/block_int.h"
33
+#
42
#include "block/nbd.h"
34
+# Copyright (C) 2023 Red Hat, Inc.
43
#include "qemu/main-loop.h"
35
+#
44
@@ -XXX,XX +XXX,XX @@ QEMU_COPYRIGHT "\n"
36
+# This program is free software; you can redistribute it and/or modify
45
}
37
+# it under the terms of the GNU General Public License as published by
46
38
+# the Free Software Foundation; either version 2 of the License, or
47
#ifdef CONFIG_POSIX
39
+# (at your option) any later version.
48
-static void termsig_handler(int signum)
40
+#
49
+/*
41
+# This program is distributed in the hope that it will be useful,
50
+ * The client thread uses SIGTERM to interrupt the server. A signal
42
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
51
+ * handler ensures that "qemu-nbd -v -c" exits with a nice status code.
43
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
52
+ */
44
+# GNU General Public License for more details.
53
+void qemu_system_killed(int signum, pid_t pid)
45
+#
54
{
46
+# You should have received a copy of the GNU General Public License
55
qatomic_cmpxchg(&state, RUNNING, TERMINATE);
47
+# along with this program. If not, see <http://www.gnu.org/licenses/>.
56
qemu_notify_event();
48
+#
57
@@ -XXX,XX +XXX,XX @@ int main(int argc, char **argv)
49
+
58
BlockExportOptions *export_opts;
50
+seq=$(basename "$0")
59
51
+echo "QA output created by $seq"
60
#ifdef CONFIG_POSIX
52
+
61
- /*
53
+status=1    # failure is the default!
62
- * Exit gracefully on various signals, which includes SIGTERM used
54
+
63
- * by 'qemu-nbd -v -c'.
55
+_cleanup()
64
- */
56
+{
65
- struct sigaction sa_sigterm;
57
+ _cleanup_qemu
66
- memset(&sa_sigterm, 0, sizeof(sa_sigterm));
58
+ rm -f "$TEST_DIR/fuse-export"
67
- sa_sigterm.sa_handler = termsig_handler;
59
+}
68
- sigaction(SIGTERM, &sa_sigterm, NULL);
60
+trap "_cleanup; exit \$status" 0 1 2 3 15
69
- sigaction(SIGINT, &sa_sigterm, NULL);
61
+
70
- sigaction(SIGHUP, &sa_sigterm, NULL);
62
+# get standard environment, filters and checks
71
-
63
+. ../common.rc
72
- signal(SIGPIPE, SIG_IGN);
64
+. ../common.filter
73
+ os_setup_early_signal_handling();
65
+. ../common.qemu
74
+ os_setup_signal_handling();
66
+
75
#endif
67
+# Format-agnostic (we do not use any), but we do test the file protocol
76
68
+_supported_proto file
77
socket_init();
69
+_require_drivers blkdebug null-co
78
diff --git a/stubs/blk-exp-close-all.c b/stubs/blk-exp-close-all.c
70
+
71
+if [ "$IMGOPTSSYNTAX" = "true" ]; then
72
+ # We need `$QEMU_IO -f file` to work; IMGOPTSSYNTAX uses --image-opts,
73
+ # breaking -f.
74
+ _unsupported_fmt $IMGFMT
75
+fi
76
+
77
+# This is a regression test of a bug in which file-posix would access zone
78
+# information in case of an I/O error even when there is no zone information,
79
+# resulting in a division by zero.
80
+# To reproduce the problem, we need to trigger an I/O error inside of
81
+# file-posix, which can be done (rootless) by providing a FUSE export that
82
+# presents only errors when accessed.
83
+
84
+_launch_qemu
85
+_send_qemu_cmd $QEMU_HANDLE \
86
+ "{'execute': 'qmp_capabilities'}" \
87
+ 'return'
88
+
89
+_send_qemu_cmd $QEMU_HANDLE \
90
+ "{'execute': 'blockdev-add',
91
+ 'arguments': {
92
+ 'driver': 'blkdebug',
93
+ 'node-name': 'node0',
94
+ 'inject-error': [{'event': 'none'}],
95
+ 'image': {
96
+ 'driver': 'null-co'
97
+ }
98
+ }}" \
99
+ 'return'
100
+
101
+# FUSE mountpoint must exist and be a regular file
102
+touch "$TEST_DIR/fuse-export"
103
+
104
+# The grep -v to filter fusermount's (benign) error when /etc/fuse.conf does
105
+# not contain user_allow_other and the subsequent check for missing FUSE support
106
+# have both been taken from iotest 308.
107
+output=$(_send_qemu_cmd $QEMU_HANDLE \
108
+ "{'execute': 'block-export-add',
109
+ 'arguments': {
110
+ 'id': 'exp0',
111
+ 'type': 'fuse',
112
+ 'node-name': 'node0',
113
+ 'mountpoint': '$TEST_DIR/fuse-export',
114
+ 'writable': true
115
+ }}" \
116
+ 'return' \
117
+ | grep -v 'option allow_other only allowed if')
118
+
119
+if echo "$output" | grep -q "Parameter 'type' does not accept value 'fuse'"; then
120
+ _notrun 'No FUSE support'
121
+fi
122
+echo "$output"
123
+
124
+echo
125
+# This should fail, but gracefully, i.e. just print an I/O error, not crash.
126
+$QEMU_IO -f file -c 'write 0 64M' "$TEST_DIR/fuse-export" | _filter_qemu_io
127
+echo
128
+
129
+_send_qemu_cmd $QEMU_HANDLE \
130
+ "{'execute': 'block-export-del',
131
+ 'arguments': {'id': 'exp0'}}" \
132
+ 'return'
133
+
134
+_send_qemu_cmd $QEMU_HANDLE \
135
+ '' \
136
+ 'BLOCK_EXPORT_DELETED'
137
+
138
+_send_qemu_cmd $QEMU_HANDLE \
139
+ "{'execute': 'blockdev-del',
140
+ 'arguments': {'node-name': 'node0'}}" \
141
+ 'return'
142
+
143
+# success, all done
144
+echo "*** done"
145
+rm -f $seq.full
146
+status=0
147
diff --git a/tests/qemu-iotests/tests/file-io-error.out b/tests/qemu-iotests/tests/file-io-error.out
148
new file mode 100644
79
new file mode 100644
149
index XXXXXXX..XXXXXXX
80
index XXXXXXX..XXXXXXX
150
--- /dev/null
81
--- /dev/null
151
+++ b/tests/qemu-iotests/tests/file-io-error.out
82
+++ b/stubs/blk-exp-close-all.c
152
@@ -XXX,XX +XXX,XX @@
83
@@ -XXX,XX +XXX,XX @@
153
+QA output created by file-io-error
84
+#include "qemu/osdep.h"
154
+{'execute': 'qmp_capabilities'}
85
+#include "block/export.h"
155
+{"return": {}}
156
+{'execute': 'blockdev-add',
157
+ 'arguments': {
158
+ 'driver': 'blkdebug',
159
+ 'node-name': 'node0',
160
+ 'inject-error': [{'event': 'none'}],
161
+ 'image': {
162
+ 'driver': 'null-co'
163
+ }
164
+ }}
165
+{"return": {}}
166
+{'execute': 'block-export-add',
167
+ 'arguments': {
168
+ 'id': 'exp0',
169
+ 'type': 'fuse',
170
+ 'node-name': 'node0',
171
+ 'mountpoint': 'TEST_DIR/fuse-export',
172
+ 'writable': true
173
+ }}
174
+{"return": {}}
175
+
86
+
176
+write failed: Input/output error
87
+/* Only used in programs that support block exports (libblockdev.fa) */
177
+
88
+void blk_exp_close_all(void)
178
+{'execute': 'block-export-del',
89
+{
179
+ 'arguments': {'id': 'exp0'}}
90
+}
180
+{"return": {}}
91
diff --git a/block/export/meson.build b/block/export/meson.build
181
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "BLOCK_EXPORT_DELETED", "data": {"id": "exp0"}}
92
index XXXXXXX..XXXXXXX 100644
182
+{'execute': 'blockdev-del',
93
--- a/block/export/meson.build
183
+ 'arguments': {'node-name': 'node0'}}
94
+++ b/block/export/meson.build
184
+{"return": {}}
95
@@ -XXX,XX +XXX,XX @@
185
+*** done
96
-block_ss.add(files('export.c'))
97
-block_ss.add(when: ['CONFIG_LINUX', 'CONFIG_VHOST_USER'], if_true: files('vhost-user-blk-server.c'))
98
+blockdev_ss.add(files('export.c'))
99
+blockdev_ss.add(when: ['CONFIG_LINUX', 'CONFIG_VHOST_USER'], if_true: files('vhost-user-blk-server.c'))
100
diff --git a/meson.build b/meson.build
101
index XXXXXXX..XXXXXXX 100644
102
--- a/meson.build
103
+++ b/meson.build
104
@@ -XXX,XX +XXX,XX @@ subdir('dump')
105
106
block_ss.add(files(
107
'block.c',
108
- 'blockdev-nbd.c',
109
'blockjob.c',
110
'job.c',
111
'qemu-io-cmds.c',
112
@@ -XXX,XX +XXX,XX @@ subdir('block')
113
114
blockdev_ss.add(files(
115
'blockdev.c',
116
+ 'blockdev-nbd.c',
117
'iothread.c',
118
'job-qmp.c',
119
))
120
@@ -XXX,XX +XXX,XX @@ if have_tools
121
qemu_io = executable('qemu-io', files('qemu-io.c'),
122
dependencies: [block, qemuutil], install: true)
123
qemu_nbd = executable('qemu-nbd', files('qemu-nbd.c'),
124
- dependencies: [block, qemuutil], install: true)
125
+ dependencies: [blockdev, qemuutil], install: true)
126
127
subdir('storage-daemon')
128
subdir('contrib/rdmacm-mux')
129
diff --git a/nbd/meson.build b/nbd/meson.build
130
index XXXXXXX..XXXXXXX 100644
131
--- a/nbd/meson.build
132
+++ b/nbd/meson.build
133
@@ -XXX,XX +XXX,XX @@
134
block_ss.add(files(
135
'client.c',
136
'common.c',
137
+))
138
+blockdev_ss.add(files(
139
'server.c',
140
))
141
diff --git a/stubs/meson.build b/stubs/meson.build
142
index XXXXXXX..XXXXXXX 100644
143
--- a/stubs/meson.build
144
+++ b/stubs/meson.build
145
@@ -XXX,XX +XXX,XX @@
146
stub_ss.add(files('arch_type.c'))
147
stub_ss.add(files('bdrv-next-monitor-owned.c'))
148
stub_ss.add(files('blk-commit-all.c'))
149
+stub_ss.add(files('blk-exp-close-all.c'))
150
stub_ss.add(files('blockdev-close-all-bdrv-states.c'))
151
stub_ss.add(files('change-state-handler.c'))
152
stub_ss.add(files('cmos.c'))
186
--
153
--
187
2.41.0
154
2.26.2
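The stub added in stubs/blk-exp-close-all.c lets programs that link libblock without libblockdev still resolve `blk_exp_close_all()`. The sketch below illustrates the same idea with a different technique: a weak default definition (a GCC/Clang extension), not the stub source set that the patch actually uses. All `demo_*` names are hypothetical.

```c
#include <assert.h>

/* Hypothetical counter so the effect of the caller can be observed. */
static int demo_close_all_calls;

/*
 * Weak default: does nothing. A program that links a strong definition of
 * demo_blk_exp_close_all() elsewhere gets that one instead. QEMU reaches a
 * similar result through its stub source set rather than weak symbols.
 */
__attribute__((weak)) void demo_blk_exp_close_all(void)
{
}

/* Stand-in for the libblock caller that must always be able to link. */
static void demo_bdrv_close_all(void)
{
    demo_blk_exp_close_all();   /* real exports or the empty default */
    demo_close_all_calls++;
}
```

With only the empty default present, `demo_bdrv_close_all()` still links and runs, which is the property the stub provides for tools that carry no block exports.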
1
From: zhenwei pi <pizhenwei@bytedance.com>
1
Make it possible to specify the iothread where the export will run. By
2
default the block node can be moved to other AioContexts later and the
3
export will follow. The fixed-iothread option forces strict behavior
4
that prevents changing AioContext while the export is active. See the
5
QAPI docs for details.
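The interaction between the iothread and fixed-iothread options can be summarized as a small decision function. This is a sketch of the behavior described above and in the blk_exp_add() hunk below, not the QEMU implementation: `DemoOutcome` and `demo_export_context()` are hypothetical, and the boolean inputs stand in for the iothread lookup and the attempt to move the block node.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcomes of export creation with respect to the AioContext. */
typedef enum {
    DEMO_STAYED,   /* export runs in the node's current context */
    DEMO_MOVED,    /* node was moved to the requested iothread */
    DEMO_FAILED    /* export creation fails */
} DemoOutcome;

static DemoOutcome demo_export_context(bool has_iothread,
                                       bool iothread_exists,
                                       bool move_succeeds,
                                       bool fixed_iothread)
{
    if (!has_iothread) {
        return DEMO_STAYED;         /* default: use the node's thread */
    }
    if (!iothread_exists) {
        return DEMO_FAILED;         /* unknown iothread name is always an error */
    }
    if (move_succeeds) {
        return DEMO_MOVED;
    }
    /* The move failed: only fatal when strict behavior was requested. */
    return fixed_iothread ? DEMO_FAILED : DEMO_STAYED;
}
```

Separately from this decision, the hunk allows later AioContext changes on the export's BlockBackend only when fixed-iothread is false.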
2
6
3
Reviewed-by: Alberto Garcia <berto@igalia.com>
7
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
4
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
8
Message-id: 20200929125516.186715-5-stefanha@redhat.com
5
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
9
[Fix stray '#' character in block-export.json and add missing "(since:
6
Message-Id: <20230728022006.1098509-5-pizhenwei@bytedance.com>
10
5.2)" as suggested by Eric Blake.
7
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
11
--Stefan]
12
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
8
---
13
---
9
tests/unit/test-throttle.c | 66 ++++++++++++++++++++++++++++++++++++++
14
qapi/block-export.json | 11 ++++++++++
10
1 file changed, 66 insertions(+)
15
block/export/export.c | 31 +++++++++++++++++++++++++++-
16
block/export/vhost-user-blk-server.c | 5 ++++-
17
nbd/server.c | 2 --
18
4 files changed, 45 insertions(+), 4 deletions(-)
11
19
12
diff --git a/tests/unit/test-throttle.c b/tests/unit/test-throttle.c
20
diff --git a/qapi/block-export.json b/qapi/block-export.json
13
index XXXXXXX..XXXXXXX 100644
21
index XXXXXXX..XXXXXXX 100644
14
--- a/tests/unit/test-throttle.c
22
--- a/qapi/block-export.json
15
+++ b/tests/unit/test-throttle.c
23
+++ b/qapi/block-export.json
16
@@ -XXX,XX +XXX,XX @@ static void test_init(void)
24
@@ -XXX,XX +XXX,XX @@
17
throttle_timers_destroy(tt);
25
# export before completion is signalled. (since: 5.2;
18
}
26
# default: false)
19
27
#
20
+static void test_init_readonly(void)
28
+# @iothread: The name of the iothread object where the export will run. The
21
+{
29
+# default is to use the thread currently associated with the
22
+ int i;
30
+# block node. (since: 5.2)
31
+#
32
+# @fixed-iothread: True prevents the block node from being moved to another
33
+# thread while the export is active. If true and @iothread is
34
+# given, export creation fails if the block node cannot be
35
+# moved to the iothread. The default is false. (since: 5.2)
36
+#
37
# Since: 4.2
38
##
39
{ 'union': 'BlockExportOptions',
40
'base': { 'type': 'BlockExportType',
41
'id': 'str',
42
+     '*fixed-iothread': 'bool',
43
+     '*iothread': 'str',
44
'node-name': 'str',
45
'*writable': 'bool',
46
'*writethrough': 'bool' },
47
diff --git a/block/export/export.c b/block/export/export.c
48
index XXXXXXX..XXXXXXX 100644
49
--- a/block/export/export.c
50
+++ b/block/export/export.c
51
@@ -XXX,XX +XXX,XX @@
52
53
#include "block/block.h"
54
#include "sysemu/block-backend.h"
55
+#include "sysemu/iothread.h"
56
#include "block/export.h"
57
#include "block/nbd.h"
58
#include "qapi/error.h"
59
@@ -XXX,XX +XXX,XX @@ static const BlockExportDriver *blk_exp_find_driver(BlockExportType type)
60
61
BlockExport *blk_exp_add(BlockExportOptions *export, Error **errp)
62
{
63
+ bool fixed_iothread = export->has_fixed_iothread && export->fixed_iothread;
64
const BlockExportDriver *drv;
65
BlockExport *exp = NULL;
66
BlockDriverState *bs;
67
- BlockBackend *blk;
68
+ BlockBackend *blk = NULL;
69
AioContext *ctx;
70
uint64_t perm;
71
int ret;
72
@@ -XXX,XX +XXX,XX @@ BlockExport *blk_exp_add(BlockExportOptions *export, Error **errp)
73
ctx = bdrv_get_aio_context(bs);
74
aio_context_acquire(ctx);
75
76
+ if (export->has_iothread) {
77
+ IOThread *iothread;
78
+ AioContext *new_ctx;
23
+
79
+
24
+ tt = &tgm.throttle_timers;
80
+ iothread = iothread_by_id(export->iothread);
81
+ if (!iothread) {
82
+ error_setg(errp, "iothread \"%s\" not found", export->iothread);
83
+ goto fail;
84
+ }
25
+
85
+
26
+ /* fill the structures with crap */
86
+ new_ctx = iothread_get_aio_context(iothread);
27
+ memset(&ts, 1, sizeof(ts));
28
+ memset(tt, 1, sizeof(*tt));
29
+
87
+
30
+ /* init structures */
88
+ ret = bdrv_try_set_aio_context(bs, new_ctx, errp);
31
+ throttle_init(&ts);
89
+ if (ret == 0) {
32
+ throttle_timers_init(tt, ctx, QEMU_CLOCK_VIRTUAL,
90
+ aio_context_release(ctx);
33
+ read_timer_cb, NULL, &ts);
91
+ aio_context_acquire(new_ctx);
34
+
92
+ ctx = new_ctx;
35
+ /* check initialized fields */
93
+ } else if (fixed_iothread) {
36
+ g_assert(tt->clock_type == QEMU_CLOCK_VIRTUAL);
94
+ goto fail;
37
+ g_assert(tt->timers[THROTTLE_READ]);
95
+ }
38
+ g_assert(!tt->timers[THROTTLE_WRITE]);
39
+
40
+ /* check other fields were cleared */
41
+ g_assert(!ts.previous_leak);
42
+ g_assert(!ts.cfg.op_size);
43
+ for (i = 0; i < BUCKETS_COUNT; i++) {
44
+ g_assert(!ts.cfg.buckets[i].avg);
45
+ g_assert(!ts.cfg.buckets[i].max);
46
+ g_assert(!ts.cfg.buckets[i].level);
47
+ }
96
+ }
48
+
97
+
49
+ throttle_timers_destroy(tt);
98
/*
50
+}
99
* Block exports are used for non-shared storage migration. Make sure
100
* that BDRV_O_INACTIVE is cleared and the image is ready for write
101
@@ -XXX,XX +XXX,XX @@ BlockExport *blk_exp_add(BlockExportOptions *export, Error **errp)
102
}
103
104
blk = blk_new(ctx, perm, BLK_PERM_ALL);
51
+
105
+
52
+static void test_init_writeonly(void)
106
+ if (!fixed_iothread) {
53
+{
107
+ blk_set_allow_aio_context_change(blk, true);
54
+ int i;
55
+
56
+ tt = &tgm.throttle_timers;
57
+
58
+ /* fill the structures with crap */
59
+ memset(&ts, 1, sizeof(ts));
60
+ memset(tt, 1, sizeof(*tt));
61
+
62
+ /* init structures */
63
+ throttle_init(&ts);
64
+ throttle_timers_init(tt, ctx, QEMU_CLOCK_VIRTUAL,
65
+ NULL, write_timer_cb, &ts);
66
+
67
+ /* check initialized fields */
68
+ g_assert(tt->clock_type == QEMU_CLOCK_VIRTUAL);
69
+ g_assert(!tt->timers[THROTTLE_READ]);
70
+ g_assert(tt->timers[THROTTLE_WRITE]);
71
+
72
+ /* check other fields were cleared */
73
+ g_assert(!ts.previous_leak);
74
+ g_assert(!ts.cfg.op_size);
75
+ for (i = 0; i < BUCKETS_COUNT; i++) {
76
+ g_assert(!ts.cfg.buckets[i].avg);
77
+ g_assert(!ts.cfg.buckets[i].max);
78
+ g_assert(!ts.cfg.buckets[i].level);
79
+ }
108
+ }
80
+
109
+
81
+ throttle_timers_destroy(tt);
110
ret = blk_insert_bs(blk, bs, errp);
82
+}
111
if (ret < 0) {
112
goto fail;
113
diff --git a/block/export/vhost-user-blk-server.c b/block/export/vhost-user-blk-server.c
114
index XXXXXXX..XXXXXXX 100644
115
--- a/block/export/vhost-user-blk-server.c
116
+++ b/block/export/vhost-user-blk-server.c
117
@@ -XXX,XX +XXX,XX @@ static const VuDevIface vu_blk_iface = {
118
static void blk_aio_attached(AioContext *ctx, void *opaque)
119
{
120
VuBlkExport *vexp = opaque;
83
+
121
+
84
static void test_destroy(void)
122
+ vexp->export.ctx = ctx;
123
vhost_user_server_attach_aio_context(&vexp->vu_server, ctx);
124
}
125
126
static void blk_aio_detach(void *opaque)
85
{
127
{
86
int i;
128
VuBlkExport *vexp = opaque;
87
@@ -XXX,XX +XXX,XX @@ int main(int argc, char **argv)
129
+
88
g_test_add_func("/throttle/leak_bucket", test_leak_bucket);
130
vhost_user_server_detach_aio_context(&vexp->vu_server);
89
g_test_add_func("/throttle/compute_wait", test_compute_wait);
131
+ vexp->export.ctx = NULL;
90
g_test_add_func("/throttle/init", test_init);
132
}
91
+ g_test_add_func("/throttle/init_readonly", test_init_readonly);
133
92
+ g_test_add_func("/throttle/init_writeonly", test_init_writeonly);
134
static void
93
g_test_add_func("/throttle/destroy", test_destroy);
135
@@ -XXX,XX +XXX,XX @@ static int vu_blk_exp_create(BlockExport *exp, BlockExportOptions *opts,
94
g_test_add_func("/throttle/have_timer", test_have_timer);
136
vu_blk_initialize_config(blk_bs(exp->blk), &vexp->blkcfg,
95
g_test_add_func("/throttle/detach_attach", test_detach_attach);
137
logical_block_size);
138
139
- blk_set_allow_aio_context_change(exp->blk, true);
140
blk_add_aio_context_notifier(exp->blk, blk_aio_attached, blk_aio_detach,
141
vexp);
142
143
diff --git a/nbd/server.c b/nbd/server.c
144
index XXXXXXX..XXXXXXX 100644
145
--- a/nbd/server.c
146
+++ b/nbd/server.c
147
@@ -XXX,XX +XXX,XX @@ static int nbd_export_create(BlockExport *blk_exp, BlockExportOptions *exp_args,
148
return ret;
149
}
150
151
- blk_set_allow_aio_context_change(blk, true);
152
-
153
QTAILQ_INIT(&exp->clients);
154
exp->name = g_strdup(arg->name);
155
exp->description = g_strdup(arg->description);
96
--
156
--
97
2.41.0
157
2.26.2
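
The iothread handling that the export patch above adds to blk_exp_add() comes down to a three-way decision. The following is a minimal sketch of that branch structure only, not the QEMU code itself: the enum, its member names, and choose_export_ctx() are hypothetical, and move_ret merely stands in for the return value of bdrv_try_set_aio_context().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcome names, used only for this illustration. */
typedef enum {
    EXPORT_USES_NEW_CTX,   /* node moved; export runs in the requested iothread */
    EXPORT_KEEPS_OLD_CTX,  /* move failed, but fixed-iothread is false */
    EXPORT_FAILS           /* move failed and fixed-iothread is true */
} ExportCtxOutcome;

/*
 * move_ret models the result of trying to move the block node into the
 * requested iothread: 0 on success, negative on failure (assumption).
 */
static ExportCtxOutcome choose_export_ctx(int move_ret, bool fixed_iothread)
{
    if (move_ret == 0) {
        return EXPORT_USES_NEW_CTX;
    }
    if (fixed_iothread) {
        /* The user insisted on this iothread, so export creation fails. */
        return EXPORT_FAILS;
    }
    /* Best effort: keep running in the node's current context. */
    return EXPORT_KEEPS_OLD_CTX;
}
```

With fixed-iothread left at its default of false, a failed move is not fatal and the export stays in the node's current context.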
1
bs->bl.zoned is what indicates whether the zone information is present
1
Allow the number of queues to be configured using --export
2
and valid; it is the only thing that raw_refresh_zoned_limits() sets if
2
vhost-user-blk,num-queues=N. This setting should match the QEMU --device
3
CONFIG_BLKZONED is not defined, and it is also the only thing that it
3
vhost-user-blk-pci,num-queues=N setting but QEMU vhost-user-blk.c lowers
4
sets if CONFIG_BLKZONED is defined, but there are no zones.
4
its own value if the vhost-user-blk backend offers fewer queues than
5
QEMU.
5
6
6
Make sure that it is always set to BLK_Z_NONE if there is an error
7
The vhost-user-blk-server.c code is already capable of multi-queue. All
7
anywhere in raw_refresh_zoned_limits() so that we do not accidentally
8
virtqueue processing runs in the same AioContext. No new locking is
8
announce zones while our information is incomplete or invalid.
9
needed.
9
10
10
This also fixes a memory leak in the last error path in
11
Add the num-queues=N option and set the VIRTIO_BLK_F_MQ feature bit.
11
raw_refresh_zoned_limits().
12
Note that the feature bit only announces the presence of the num_queues
13
configuration space field. It does not promise that there is more than 1
14
virtqueue, so we can set it unconditionally.
12
15
13
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
16
I tested multi-queue by running a random read fio test with numjobs=4 on
14
Message-Id: <20230824155345.109765-2-hreitz@redhat.com>
17
an -smp 4 guest. After the benchmark finished the guest /proc/interrupts
15
Reviewed-by: Sam Li <faithilikerun@gmail.com>
18
file showed activity on all 4 virtio-blk MSI-X. The /sys/block/vda/mq/
19
directory shows that Linux blk-mq has 4 queues configured.
20
21
An automated test is included in the next commit.
22
23
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
24
Acked-by: Markus Armbruster <armbru@redhat.com>
25
Message-id: 20201001144604.559733-2-stefanha@redhat.com
26
[Fixed accidental tab characters as suggested by Markus Armbruster
27
--Stefan]
28
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
16
---
29
---
17
block/file-posix.c | 21 ++++++++++++---------
30
qapi/block-export.json | 10 +++++++---
18
1 file changed, 12 insertions(+), 9 deletions(-)
31
block/export/vhost-user-blk-server.c | 24 ++++++++++++++++++------
32
2 files changed, 25 insertions(+), 9 deletions(-)
19
33
20
diff --git a/block/file-posix.c b/block/file-posix.c
34
diff --git a/qapi/block-export.json b/qapi/block-export.json
21
index XXXXXXX..XXXXXXX 100644
35
index XXXXXXX..XXXXXXX 100644
22
--- a/block/file-posix.c
36
--- a/qapi/block-export.json
23
+++ b/block/file-posix.c
37
+++ b/qapi/block-export.json
24
@@ -XXX,XX +XXX,XX @@ static void raw_refresh_zoned_limits(BlockDriverState *bs, struct stat *st,
38
@@ -XXX,XX +XXX,XX @@
25
BlockZoneModel zoned;
39
# SocketAddress types are supported. Passed fds must be UNIX domain
26
int ret;
40
# sockets.
27
41
# @logical-block-size: Logical block size in bytes. Defaults to 512 bytes.
28
- bs->bl.zoned = BLK_Z_NONE;
42
+# @num-queues: Number of request virtqueues. Must be greater than 0. Defaults
29
-
43
+# to 1.
30
ret = get_sysfs_zoned_model(st, &zoned);
44
#
31
if (ret < 0 || zoned == BLK_Z_NONE) {
45
# Since: 5.2
32
- return;
46
##
33
+ goto no_zoned;
47
{ 'struct': 'BlockExportOptionsVhostUserBlk',
48
- 'data': { 'addr': 'SocketAddress', '*logical-block-size': 'size' } }
49
+ 'data': { 'addr': 'SocketAddress',
50
+     '*logical-block-size': 'size',
51
+ '*num-queues': 'uint16'} }
52
53
##
54
# @NbdServerAddOptions:
55
@@ -XXX,XX +XXX,XX @@
56
{ 'union': 'BlockExportOptions',
57
'base': { 'type': 'BlockExportType',
58
'id': 'str',
59
-     '*fixed-iothread': 'bool',
60
-     '*iothread': 'str',
61
+ '*fixed-iothread': 'bool',
62
+ '*iothread': 'str',
63
'node-name': 'str',
64
'*writable': 'bool',
65
'*writethrough': 'bool' },
66
diff --git a/block/export/vhost-user-blk-server.c b/block/export/vhost-user-blk-server.c
67
index XXXXXXX..XXXXXXX 100644
68
--- a/block/export/vhost-user-blk-server.c
69
+++ b/block/export/vhost-user-blk-server.c
70
@@ -XXX,XX +XXX,XX @@
71
#include "util/block-helpers.h"
72
73
enum {
74
- VHOST_USER_BLK_MAX_QUEUES = 1,
75
+ VHOST_USER_BLK_NUM_QUEUES_DEFAULT = 1,
76
};
77
struct virtio_blk_inhdr {
78
unsigned char status;
79
@@ -XXX,XX +XXX,XX @@ static uint64_t vu_blk_get_features(VuDev *dev)
80
1ull << VIRTIO_BLK_F_DISCARD |
81
1ull << VIRTIO_BLK_F_WRITE_ZEROES |
82
1ull << VIRTIO_BLK_F_CONFIG_WCE |
83
+ 1ull << VIRTIO_BLK_F_MQ |
84
1ull << VIRTIO_F_VERSION_1 |
85
1ull << VIRTIO_RING_F_INDIRECT_DESC |
86
1ull << VIRTIO_RING_F_EVENT_IDX |
87
@@ -XXX,XX +XXX,XX @@ static void blk_aio_detach(void *opaque)
88
89
static void
90
vu_blk_initialize_config(BlockDriverState *bs,
91
- struct virtio_blk_config *config, uint32_t blk_size)
92
+ struct virtio_blk_config *config,
93
+ uint32_t blk_size,
94
+ uint16_t num_queues)
95
{
96
config->capacity = bdrv_getlength(bs) >> BDRV_SECTOR_BITS;
97
config->blk_size = blk_size;
98
@@ -XXX,XX +XXX,XX @@ vu_blk_initialize_config(BlockDriverState *bs,
99
config->seg_max = 128 - 2;
100
config->min_io_size = 1;
101
config->opt_io_size = 1;
102
- config->num_queues = VHOST_USER_BLK_MAX_QUEUES;
103
+ config->num_queues = num_queues;
104
config->max_discard_sectors = 32768;
105
config->max_discard_seg = 1;
106
config->discard_sector_alignment = config->blk_size >> 9;
107
@@ -XXX,XX +XXX,XX @@ static int vu_blk_exp_create(BlockExport *exp, BlockExportOptions *opts,
108
BlockExportOptionsVhostUserBlk *vu_opts = &opts->u.vhost_user_blk;
109
Error *local_err = NULL;
110
uint64_t logical_block_size;
111
+ uint16_t num_queues = VHOST_USER_BLK_NUM_QUEUES_DEFAULT;
112
113
vexp->writable = opts->writable;
114
vexp->blkcfg.wce = 0;
115
@@ -XXX,XX +XXX,XX @@ static int vu_blk_exp_create(BlockExport *exp, BlockExportOptions *opts,
34
}
116
}
35
bs->bl.zoned = zoned;
117
vexp->blk_size = logical_block_size;
36
118
blk_set_guest_block_size(exp->blk, logical_block_size);
37
@@ -XXX,XX +XXX,XX @@ static void raw_refresh_zoned_limits(BlockDriverState *bs, struct stat *st,
38
if (ret < 0) {
39
error_setg_errno(errp, -ret, "Unable to read chunk_sectors "
40
"sysfs attribute");
41
- return;
42
+ goto no_zoned;
43
} else if (!ret) {
44
error_setg(errp, "Read 0 from chunk_sectors sysfs attribute");
45
- return;
46
+ goto no_zoned;
47
}
48
bs->bl.zone_size = ret << BDRV_SECTOR_BITS;
49
50
@@ -XXX,XX +XXX,XX @@ static void raw_refresh_zoned_limits(BlockDriverState *bs, struct stat *st,
51
if (ret < 0) {
52
error_setg_errno(errp, -ret, "Unable to read nr_zones "
53
"sysfs attribute");
54
- return;
55
+ goto no_zoned;
56
} else if (!ret) {
57
error_setg(errp, "Read 0 from nr_zones sysfs attribute");
58
- return;
59
+ goto no_zoned;
60
}
61
bs->bl.nr_zones = ret;
62
63
@@ -XXX,XX +XXX,XX @@ static void raw_refresh_zoned_limits(BlockDriverState *bs, struct stat *st,
64
ret = get_zones_wp(bs, s->fd, 0, bs->bl.nr_zones, 0);
65
if (ret < 0) {
66
error_setg_errno(errp, -ret, "report wps failed");
67
- bs->wps = NULL;
68
- return;
69
+ goto no_zoned;
70
}
71
qemu_co_mutex_init(&bs->wps->colock);
72
+ return;
73
+
119
+
74
+no_zoned:
120
+ if (vu_opts->has_num_queues) {
75
+ bs->bl.zoned = BLK_Z_NONE;
121
+ num_queues = vu_opts->num_queues;
76
+ g_free(bs->wps);
122
+ }
77
+ bs->wps = NULL;
123
+ if (num_queues == 0) {
78
}
124
+ error_setg(errp, "num-queues must be greater than 0");
79
#else /* !defined(CONFIG_BLKZONED) */
125
+ return -EINVAL;
80
static void raw_refresh_zoned_limits(BlockDriverState *bs, struct stat *st,
126
+ }
127
+
128
vu_blk_initialize_config(blk_bs(exp->blk), &vexp->blkcfg,
129
- logical_block_size);
130
+ logical_block_size, num_queues);
131
132
blk_add_aio_context_notifier(exp->blk, blk_aio_attached, blk_aio_detach,
133
vexp);
134
135
if (!vhost_user_server_start(&vexp->vu_server, vu_opts->addr, exp->ctx,
136
- VHOST_USER_BLK_MAX_QUEUES, &vu_blk_iface,
137
- errp)) {
138
+ num_queues, &vu_blk_iface, errp)) {
139
blk_remove_aio_context_notifier(exp->blk, blk_aio_attached,
140
blk_aio_detach, vexp);
141
return -EADDRNOTAVAIL;
81
--
142
--
82
2.41.0
143
2.26.2
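
The num-queues handling in vu_blk_exp_create() above (default of 1, reject 0) can be sketched in isolation as follows. This is an illustrative standalone helper, not part of the patch: resolve_num_queues() and NUM_QUEUES_DEFAULT are hypothetical names, and the has_num_queues/num_queues pair only mimics how an optional QAPI member is presented to C code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

enum {
    NUM_QUEUES_DEFAULT = 1, /* mirrors VHOST_USER_BLK_NUM_QUEUES_DEFAULT */
};

/*
 * Hypothetical helper: compute the effective number of request virtqueues.
 * Returns 0 and stores the value in *out on success, or -EINVAL if the
 * caller explicitly asked for 0 queues.
 */
static int resolve_num_queues(bool has_num_queues, uint16_t num_queues,
                              uint16_t *out)
{
    uint16_t n = NUM_QUEUES_DEFAULT;

    if (has_num_queues) {
        n = num_queues;
    }
    if (n == 0) {
        return -EINVAL; /* "num-queues must be greater than 0" */
    }
    *out = n;
    return 0;
}
```

Because the option is a uint16, the only invalid value left to check after parsing is 0.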
1
From: zhenwei pi <pizhenwei@bytedance.com>
1
From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2
2
3
Only one direction is necessary in several scenarios:
3
bdrv_co_block_status_above has several design problems with handling
4
- a read-only disk
4
short backing files:
5
- operations on a device are considered as *write* only. For example,
6
encrypt/decrypt/sign/verify operations on a cryptodev use a single
7
*write* timer (the read timer callback is defined, but never invoked).
8
5
9
Allow a single direction in throttle; this reduces memory use, and the upper layer
6
1. With want_zero=true, it may return ret with BDRV_BLOCK_ZERO but
10
does not need a dummy callback any more.
7
without BDRV_BLOCK_ALLOCATED flag, when actually short backing file
8
which produces these after-EOF zeros is inside requested backing
9
sequence.
11
10
11
2. With want_zero=false, it may return pnum=0 prior to actual EOF,
12
because of EOF of short backing file.
13
14
Fix these things, making logic about short backing files clearer.
15
16
With fixed bdrv_block_status_above we also have to improve is_zero in
17
qcow2 code, otherwise iotest 154 will fail, because with this patch we
18
stop merging zeros of different types (produced by fully unallocated
19
in the whole backing chain regions vs produced by short backing files).
20
21
Note also, that this patch leaves for another day the general problem
22
around block-status: misuse of BDRV_BLOCK_ALLOCATED as is-fs-allocated
23
vs go-to-backing.
24
25
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
12
Reviewed-by: Alberto Garcia <berto@igalia.com>
26
Reviewed-by: Alberto Garcia <berto@igalia.com>
13
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
27
Reviewed-by: Eric Blake <eblake@redhat.com>
14
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
28
Message-id: 20200924194003.22080-2-vsementsov@virtuozzo.com
15
Message-Id: <20230728022006.1098509-4-pizhenwei@bytedance.com>
29
[Fix s/comes/come/ as suggested by Eric Blake
16
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
30
--Stefan]
31
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
17
---
32
---
18
util/throttle.c | 42 ++++++++++++++++++++++++++++--------------
33
block/io.c | 68 ++++++++++++++++++++++++++++++++++++++++-----------
19
1 file changed, 28 insertions(+), 14 deletions(-)
34
block/qcow2.c | 16 ++++++++++--
35
2 files changed, 68 insertions(+), 16 deletions(-)
20
36
21
diff --git a/util/throttle.c b/util/throttle.c
37
diff --git a/block/io.c b/block/io.c
22
index XXXXXXX..XXXXXXX 100644
38
index XXXXXXX..XXXXXXX 100644
23
--- a/util/throttle.c
39
--- a/block/io.c
24
+++ b/util/throttle.c
40
+++ b/block/io.c
25
@@ -XXX,XX +XXX,XX @@ static bool throttle_compute_timer(ThrottleState *ts,
41
@@ -XXX,XX +XXX,XX @@ bdrv_co_common_block_status_above(BlockDriverState *bs,
26
void throttle_timers_attach_aio_context(ThrottleTimers *tt,
42
int64_t *map,
27
AioContext *new_context)
43
BlockDriverState **file)
28
{
44
{
29
- tt->timers[THROTTLE_READ] =
45
+ int ret;
30
- aio_timer_new(new_context, tt->clock_type, SCALE_NS,
46
BlockDriverState *p;
31
- tt->timer_cb[THROTTLE_READ], tt->timer_opaque);
47
- int ret = 0;
32
- tt->timers[THROTTLE_WRITE] =
48
- bool first = true;
33
- aio_timer_new(new_context, tt->clock_type, SCALE_NS,
49
+ int64_t eof = 0;
34
- tt->timer_cb[THROTTLE_WRITE], tt->timer_opaque);
50
35
+ ThrottleDirection dir;
51
assert(bs != base);
52
- for (p = bs; p != base; p = bdrv_filter_or_cow_bs(p)) {
36
+
53
+
37
+ for (dir = THROTTLE_READ; dir < THROTTLE_MAX; dir++) {
54
+ ret = bdrv_co_block_status(bs, want_zero, offset, bytes, pnum, map, file);
38
+ if (tt->timer_cb[dir]) {
55
+ if (ret < 0 || *pnum == 0 || ret & BDRV_BLOCK_ALLOCATED) {
39
+ tt->timers[dir] =
56
+ return ret;
40
+ aio_timer_new(new_context, tt->clock_type, SCALE_NS,
41
+ tt->timer_cb[dir], tt->timer_opaque);
42
+ }
43
+ }
57
+ }
58
+
59
+ if (ret & BDRV_BLOCK_EOF) {
60
+ eof = offset + *pnum;
61
+ }
62
+
63
+ assert(*pnum <= bytes);
64
+ bytes = *pnum;
65
+
66
+ for (p = bdrv_filter_or_cow_bs(bs); p != base;
67
+ p = bdrv_filter_or_cow_bs(p))
68
+ {
69
ret = bdrv_co_block_status(p, want_zero, offset, bytes, pnum, map,
70
file);
71
if (ret < 0) {
72
- break;
73
+ return ret;
74
}
75
- if (ret & BDRV_BLOCK_ZERO && ret & BDRV_BLOCK_EOF && !first) {
76
+ if (*pnum == 0) {
77
/*
78
- * Reading beyond the end of the file continues to read
79
- * zeroes, but we can only widen the result to the
80
- * unallocated length we learned from an earlier
81
- * iteration.
82
+ * The top layer deferred to this layer, and because this layer is
83
+ * short, any zeroes that we synthesize beyond EOF behave as if they
84
+ * were allocated at this layer.
85
+ *
86
+ * We don't include BDRV_BLOCK_EOF into ret, as upper layer may be
87
+ * larger. We'll add BDRV_BLOCK_EOF if needed at function end, see
88
+ * below.
89
*/
90
+ assert(ret & BDRV_BLOCK_EOF);
91
*pnum = bytes;
92
+ if (file) {
93
+ *file = p;
94
+ }
95
+ ret = BDRV_BLOCK_ZERO | BDRV_BLOCK_ALLOCATED;
96
+ break;
97
}
98
- if (ret & (BDRV_BLOCK_ZERO | BDRV_BLOCK_DATA)) {
99
+ if (ret & BDRV_BLOCK_ALLOCATED) {
100
+ /*
101
+ * We've found the node and the status, we must break.
102
+ *
103
+ * Drop BDRV_BLOCK_EOF, as it's not for upper layer, which may be
104
+ * larger. We'll add BDRV_BLOCK_EOF if needed at function end, see
105
+ * below.
106
+ */
107
+ ret &= ~BDRV_BLOCK_EOF;
108
break;
109
}
110
- /* [offset, pnum] unallocated on this layer, which could be only
111
- * the first part of [offset, bytes]. */
112
- bytes = MIN(bytes, *pnum);
113
- first = false;
114
+
115
+ /*
116
+ * OK, [offset, offset + *pnum) region is unallocated on this layer,
117
+ * let's continue the diving.
118
+ */
119
+ assert(*pnum <= bytes);
120
+ bytes = *pnum;
121
+ }
122
+
123
+ if (offset + *pnum == eof) {
124
+ ret |= BDRV_BLOCK_EOF;
125
}
126
+
127
return ret;
44
}
128
}
45
129
46
/*
130
diff --git a/block/qcow2.c b/block/qcow2.c
47
@@ -XXX,XX +XXX,XX @@ void throttle_timers_init(ThrottleTimers *tt,
131
index XXXXXXX..XXXXXXX 100644
48
QEMUTimerCB *write_timer_cb,
132
--- a/block/qcow2.c
49
void *timer_opaque)
133
+++ b/block/qcow2.c
50
{
134
@@ -XXX,XX +XXX,XX @@ static bool is_zero(BlockDriverState *bs, int64_t offset, int64_t bytes)
51
+ assert(read_timer_cb || write_timer_cb);
135
if (!bytes) {
52
memset(tt, 0, sizeof(ThrottleTimers));
53
54
tt->clock_type = clock_type;
55
@@ -XXX,XX +XXX,XX @@ void throttle_timers_init(ThrottleTimers *tt,
56
/* destroy a timer */
57
static void throttle_timer_destroy(QEMUTimer **timer)
58
{
59
- assert(*timer != NULL);
60
+ if (*timer == NULL) {
61
+ return;
62
+ }
63
64
timer_free(*timer);
65
*timer = NULL;
66
@@ -XXX,XX +XXX,XX @@ static void throttle_timer_destroy(QEMUTimer **timer)
67
/* Remove timers from event loop */
68
void throttle_timers_detach_aio_context(ThrottleTimers *tt)
69
{
70
- int i;
71
+ ThrottleDirection dir;
72
73
- for (i = 0; i < THROTTLE_MAX; i++) {
74
- throttle_timer_destroy(&tt->timers[i]);
75
+ for (dir = THROTTLE_READ; dir < THROTTLE_MAX; dir++) {
76
+ throttle_timer_destroy(&tt->timers[dir]);
77
}
78
}
79
80
@@ -XXX,XX +XXX,XX @@ void throttle_timers_destroy(ThrottleTimers *tt)
81
/* is any throttling timer configured */
82
bool throttle_timers_are_initialized(ThrottleTimers *tt)
83
{
84
- if (tt->timers[0]) {
85
- return true;
86
+ ThrottleDirection dir;
87
+
88
+ for (dir = THROTTLE_READ; dir < THROTTLE_MAX; dir++) {
89
+ if (tt->timers[dir]) {
90
+ return true;
91
+ }
92
}
93
94
return false;
95
@@ -XXX,XX +XXX,XX @@ bool throttle_schedule_timer(ThrottleState *ts,
96
{
97
int64_t now = qemu_clock_get_ns(tt->clock_type);
98
int64_t next_timestamp;
99
+ QEMUTimer *timer;
100
bool must_wait;
101
102
+ timer = is_write ? tt->timers[THROTTLE_WRITE] : tt->timers[THROTTLE_READ];
103
+ assert(timer);
104
+
105
must_wait = throttle_compute_timer(ts,
106
is_write,
107
now,
108
@@ -XXX,XX +XXX,XX @@ bool throttle_schedule_timer(ThrottleState *ts,
109
}
110
111
/* request throttled and timer pending -> do nothing */
112
- if (timer_pending(tt->timers[is_write])) {
113
+ if (timer_pending(timer)) {
114
return true;
136
return true;
115
}
137
}
116
138
- res = bdrv_block_status_above(bs, NULL, offset, bytes, &nr, NULL, NULL);
117
/* request throttled and timer not pending -> arm timer */
139
- return res >= 0 && (res & BDRV_BLOCK_ZERO) && nr == bytes;
118
- timer_mod(tt->timers[is_write], next_timestamp);
140
+
119
+ timer_mod(timer, next_timestamp);
141
+ /*
120
return true;
142
+ * bdrv_block_status_above doesn't merge different types of zeros, for
143
+ * example, zeros which come from the region which is unallocated in
144
+ * the whole backing chain, and zeros which come because of a short
145
+ * backing file. So, we need a loop.
146
+ */
147
+ do {
148
+ res = bdrv_block_status_above(bs, NULL, offset, bytes, &nr, NULL, NULL);
149
+ offset += nr;
150
+ bytes -= nr;
151
+ } while (res >= 0 && (res & BDRV_BLOCK_ZERO) && nr && bytes);
152
+
153
+ return res >= 0 && (res & BDRV_BLOCK_ZERO) && bytes == 0;
121
}
154
}
122
155
156
static coroutine_fn int qcow2_co_pwrite_zeroes(BlockDriverState *bs,
123
--
157
--
124
2.41.0
158
2.26.2
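
The reason is_zero() needs a loop after this patch is that one status query may now stop at the boundary between two kinds of zeroes. The following toy model is a sketch under stated assumptions: Extent, TOY_ZERO, toy_block_status() and toy_is_zero() are all hypothetical stand-ins, and toy_block_status() only imitates the "status plus number of bytes" shape of bdrv_block_status_above().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_ZERO 1 /* stand-in for BDRV_BLOCK_ZERO */

/* One run of bytes with a uniform status (hypothetical type). */
typedef struct {
    int64_t start;
    int64_t len;
    bool zero;
} Extent;

/*
 * Report the status at offset and, in *pnum, how many of the requested
 * bytes share it. Adjacent zero extents are deliberately not merged.
 */
static int toy_block_status(const Extent *map, size_t n, int64_t offset,
                            int64_t bytes, int64_t *pnum)
{
    for (size_t i = 0; i < n; i++) {
        int64_t end = map[i].start + map[i].len;
        if (offset >= map[i].start && offset < end) {
            int64_t avail = end - offset;
            *pnum = avail < bytes ? avail : bytes;
            return map[i].zero ? TOY_ZERO : 0;
        }
    }
    *pnum = 0; /* past the end of the map */
    return 0;
}

/* Same loop shape as the patched is_zero(): keep asking until done. */
static bool toy_is_zero(const Extent *map, size_t n, int64_t offset,
                        int64_t bytes)
{
    int64_t nr;
    int res;

    if (!bytes) {
        return true;
    }
    do {
        res = toy_block_status(map, n, offset, bytes, &nr);
        offset += nr;
        bytes -= nr;
    } while (res >= 0 && (res & TOY_ZERO) && nr && bytes);

    return res >= 0 && (res & TOY_ZERO) && bytes == 0;
}
```

A single query followed by an `nr == bytes` check would report "not zero" for a range that spans two zero extents, which is the failure the commit message describes for iotest 154.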
1
From: zhenwei pi <pizhenwei@bytedance.com>
1
From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2
2
3
Use enum ThrottleDirection instead of number index.
3
In order to reuse bdrv_common_block_status_above in
4
bdrv_is_allocated_above, let's support an include_base parameter.
4
5
6
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
5
Reviewed-by: Alberto Garcia <berto@igalia.com>
7
Reviewed-by: Alberto Garcia <berto@igalia.com>
6
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
8
Reviewed-by: Eric Blake <eblake@redhat.com>
7
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
9
Message-id: 20200924194003.22080-3-vsementsov@virtuozzo.com
8
Message-Id: <20230728022006.1098509-2-pizhenwei@bytedance.com>
10
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
9
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
10
---
11
---
11
include/qemu/throttle.h | 11 ++++++++---
12
block/coroutines.h | 2 ++
12
util/throttle.c | 16 +++++++++-------
13
block/io.c | 21 ++++++++++++++-------
13
2 files changed, 17 insertions(+), 10 deletions(-)
14
2 files changed, 16 insertions(+), 7 deletions(-)
14
15
15
diff --git a/include/qemu/throttle.h b/include/qemu/throttle.h
16
diff --git a/block/coroutines.h b/block/coroutines.h
16
index XXXXXXX..XXXXXXX 100644
17
index XXXXXXX..XXXXXXX 100644
17
--- a/include/qemu/throttle.h
18
--- a/block/coroutines.h
18
+++ b/include/qemu/throttle.h
19
+++ b/block/coroutines.h
19
@@ -XXX,XX +XXX,XX @@ typedef struct ThrottleState {
20
@@ -XXX,XX +XXX,XX @@ bdrv_pwritev(BdrvChild *child, int64_t offset, unsigned int bytes,
20
int64_t previous_leak; /* timestamp of the last leak done */
21
int coroutine_fn
21
} ThrottleState;
22
bdrv_co_common_block_status_above(BlockDriverState *bs,
22
23
BlockDriverState *base,
23
+typedef enum {
24
+ bool include_base,
24
+ THROTTLE_READ = 0,
25
bool want_zero,
25
+ THROTTLE_WRITE,
26
int64_t offset,
26
+ THROTTLE_MAX
27
int64_t bytes,
27
+} ThrottleDirection;
28
@@ -XXX,XX +XXX,XX @@ bdrv_co_common_block_status_above(BlockDriverState *bs,
29
int generated_co_wrapper
30
bdrv_common_block_status_above(BlockDriverState *bs,
31
BlockDriverState *base,
32
+ bool include_base,
33
bool want_zero,
34
int64_t offset,
35
int64_t bytes,
36
diff --git a/block/io.c b/block/io.c
37
index XXXXXXX..XXXXXXX 100644
38
--- a/block/io.c
39
+++ b/block/io.c
40
@@ -XXX,XX +XXX,XX @@ early_out:
41
int coroutine_fn
42
bdrv_co_common_block_status_above(BlockDriverState *bs,
43
BlockDriverState *base,
44
+ bool include_base,
45
bool want_zero,
46
int64_t offset,
47
int64_t bytes,
48
@@ -XXX,XX +XXX,XX @@ bdrv_co_common_block_status_above(BlockDriverState *bs,
49
BlockDriverState *p;
50
int64_t eof = 0;
51
52
- assert(bs != base);
53
+ assert(include_base || bs != base);
54
+ assert(!include_base || base); /* Can't include NULL base */
55
56
ret = bdrv_co_block_status(bs, want_zero, offset, bytes, pnum, map, file);
57
- if (ret < 0 || *pnum == 0 || ret & BDRV_BLOCK_ALLOCATED) {
58
+ if (ret < 0 || *pnum == 0 || ret & BDRV_BLOCK_ALLOCATED || bs == base) {
59
return ret;
60
}
61
62
@@ -XXX,XX +XXX,XX @@ bdrv_co_common_block_status_above(BlockDriverState *bs,
63
assert(*pnum <= bytes);
64
bytes = *pnum;
65
66
- for (p = bdrv_filter_or_cow_bs(bs); p != base;
67
+ for (p = bdrv_filter_or_cow_bs(bs); include_base || p != base;
68
p = bdrv_filter_or_cow_bs(p))
69
{
70
ret = bdrv_co_block_status(p, want_zero, offset, bytes, pnum, map,
71
@@ -XXX,XX +XXX,XX @@ bdrv_co_common_block_status_above(BlockDriverState *bs,
72
break;
73
}
74
75
+ if (p == base) {
76
+ assert(include_base);
77
+ break;
78
+ }
28
+
79
+
29
typedef struct ThrottleTimers {
80
/*
30
- QEMUTimer *timers[2]; /* timers used to do the throttling */
81
* OK, [offset, offset + *pnum) region is unallocated on this layer,
31
+ QEMUTimer *timers[THROTTLE_MAX]; /* timers used to do the throttling */
82
* let's continue the diving.
32
QEMUClockType clock_type; /* the clock used */
83
@@ -XXX,XX +XXX,XX @@ int bdrv_block_status_above(BlockDriverState *bs, BlockDriverState *base,
33
84
int64_t offset, int64_t bytes, int64_t *pnum,
34
/* Callbacks */
85
int64_t *map, BlockDriverState **file)
35
- QEMUTimerCB *read_timer_cb;
36
- QEMUTimerCB *write_timer_cb;
37
+ QEMUTimerCB *timer_cb[THROTTLE_MAX];
38
void *timer_opaque;
39
} ThrottleTimers;
40
41
diff --git a/util/throttle.c b/util/throttle.c
42
index XXXXXXX..XXXXXXX 100644
43
--- a/util/throttle.c
44
+++ b/util/throttle.c
45
@@ -XXX,XX +XXX,XX @@ static bool throttle_compute_timer(ThrottleState *ts,
46
void throttle_timers_attach_aio_context(ThrottleTimers *tt,
47
AioContext *new_context)
48
{
86
{
49
- tt->timers[0] = aio_timer_new(new_context, tt->clock_type, SCALE_NS,
87
- return bdrv_common_block_status_above(bs, base, true, offset, bytes,
50
- tt->read_timer_cb, tt->timer_opaque);
88
+ return bdrv_common_block_status_above(bs, base, false, true, offset, bytes,
51
- tt->timers[1] = aio_timer_new(new_context, tt->clock_type, SCALE_NS,
89
pnum, map, file);
52
- tt->write_timer_cb, tt->timer_opaque);
53
+ tt->timers[THROTTLE_READ] =
54
+ aio_timer_new(new_context, tt->clock_type, SCALE_NS,
55
+ tt->timer_cb[THROTTLE_READ], tt->timer_opaque);
56
+ tt->timers[THROTTLE_WRITE] =
57
+ aio_timer_new(new_context, tt->clock_type, SCALE_NS,
58
+ tt->timer_cb[THROTTLE_WRITE], tt->timer_opaque);
59
}
90
}
60
91
61
/*
92
@@ -XXX,XX +XXX,XX @@ int coroutine_fn bdrv_is_allocated(BlockDriverState *bs, int64_t offset,
62
@@ -XXX,XX +XXX,XX @@ void throttle_timers_init(ThrottleTimers *tt,
93
int ret;
63
memset(tt, 0, sizeof(ThrottleTimers));
94
int64_t dummy;
64
95
65
tt->clock_type = clock_type;
96
- ret = bdrv_common_block_status_above(bs, bdrv_filter_or_cow_bs(bs), false,
66
- tt->read_timer_cb = read_timer_cb;
97
- offset, bytes, pnum ? pnum : &dummy,
67
- tt->write_timer_cb = write_timer_cb;
98
- NULL, NULL);
68
+ tt->timer_cb[THROTTLE_READ] = read_timer_cb;
99
+ ret = bdrv_common_block_status_above(bs, bs, true, false, offset,
69
+ tt->timer_cb[THROTTLE_WRITE] = write_timer_cb;
100
+ bytes, pnum ? pnum : &dummy, NULL,
70
tt->timer_opaque = timer_opaque;
101
+ NULL);
71
throttle_timers_attach_aio_context(tt, aio_context);
102
if (ret < 0) {
72
}
103
return ret;
73
@@ -XXX,XX +XXX,XX @@ void throttle_timers_detach_aio_context(ThrottleTimers *tt)
74
{
75
int i;
76
77
- for (i = 0; i < 2; i++) {
78
+ for (i = 0; i < THROTTLE_MAX; i++) {
79
throttle_timer_destroy(&tt->timers[i]);
80
}
104
}
81
}
82
--
105
--
83
2.41.0
106
2.26.2
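
To illustrate why indexing by ThrottleDirection is convenient, here is a small sketch that combines the enum from the patch above with the per-direction loop from the single-direction patch in this series. ToyTimers, toy_cb() and toy_attach() are hypothetical; an int flag stands in for a real QEMUTimer so that the example is self-contained.

```c
#include <assert.h>
#include <stddef.h>

typedef enum {
    THROTTLE_READ = 0,
    THROTTLE_WRITE,
    THROTTLE_MAX
} ThrottleDirection;

/* Toy stand-in for ThrottleTimers (assumption, not the QEMU struct). */
typedef struct {
    void (*timer_cb[THROTTLE_MAX])(void *opaque);
    int armed[THROTTLE_MAX]; /* 1 if a "timer" exists for this direction */
} ToyTimers;

static void toy_cb(void *opaque)
{
    (void)opaque;
}

/*
 * Create a "timer" only for directions that have a callback and return
 * how many were created.
 */
static int toy_attach(ToyTimers *tt)
{
    int created = 0;

    for (int dir = THROTTLE_READ; dir < THROTTLE_MAX; dir++) {
        tt->armed[dir] = tt->timer_cb[dir] != NULL;
        created += tt->armed[dir];
    }
    return created;
}
```

Using THROTTLE_MAX as both the array size and the loop bound means a new direction would only need a new enum member, not edits to every hard-coded 2.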
1
From: zhenwei pi <pizhenwei@bytedance.com>
1
From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2
2
3
The first dimension of both to_check and
3
We are going to reuse bdrv_common_block_status_above in
4
bucket_types_size/bucket_types_units is used as throttle direction,
4
bdrv_is_allocated_above. bdrv_is_allocated_above may be called with
5
use THROTTLE_MAX instead of hard coded number. Also use ARRAY_SIZE()
5
include_base == false and still bs == base (e.g. from img_rebase()).
6
to avoid hard coded number for the second dimension.
7
6
8
Hanna noticed that the two arrays should be static. Yes, turn them
7
So, support this corner case.
9
into static variables.
10
8
11
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
9
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
12
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
10
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
13
Message-Id: <20230728022006.1098509-8-pizhenwei@bytedance.com>
11
Reviewed-by: Eric Blake <eblake@redhat.com>
14
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
12
Reviewed-by: Alberto Garcia <berto@igalia.com>
13
Message-id: 20200924194003.22080-4-vsementsov@virtuozzo.com
14
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
15
---
15
---
16
util/throttle.c | 11 ++++++-----
16
block/io.c | 6 +++++-
17
1 file changed, 6 insertions(+), 5 deletions(-)
17
1 file changed, 5 insertions(+), 1 deletion(-)
18
18
19
diff --git a/util/throttle.c b/util/throttle.c
19
diff --git a/block/io.c b/block/io.c
20
index XXXXXXX..XXXXXXX 100644
20
index XXXXXXX..XXXXXXX 100644
21
--- a/util/throttle.c
21
--- a/block/io.c
22
+++ b/util/throttle.c
22
+++ b/block/io.c
23
@@ -XXX,XX +XXX,XX @@ int64_t throttle_compute_wait(LeakyBucket *bkt)
23
@@ -XXX,XX +XXX,XX @@ bdrv_co_common_block_status_above(BlockDriverState *bs,
24
static int64_t throttle_compute_wait_for(ThrottleState *ts,
24
BlockDriverState *p;
25
ThrottleDirection direction)
25
int64_t eof = 0;
26
{
26
27
- BucketType to_check[2][4] = { {THROTTLE_BPS_TOTAL,
27
- assert(include_base || bs != base);
28
+ static const BucketType to_check[THROTTLE_MAX][4] = {
28
assert(!include_base || base); /* Can't include NULL base */
29
+ {THROTTLE_BPS_TOTAL,
29
30
THROTTLE_OPS_TOTAL,
30
+ if (!include_base && bs == base) {
31
THROTTLE_BPS_READ,
31
+ *pnum = bytes;
32
THROTTLE_OPS_READ},
32
+ return 0;
33
@@ -XXX,XX +XXX,XX @@ static int64_t throttle_compute_wait_for(ThrottleState *ts,
33
+ }
34
int64_t wait, max_wait = 0;
34
+
35
int i;
35
ret = bdrv_co_block_status(bs, want_zero, offset, bytes, pnum, map, file);
36
36
if (ret < 0 || *pnum == 0 || ret & BDRV_BLOCK_ALLOCATED || bs == base) {
37
- for (i = 0; i < 4; i++) {
37
return ret;
38
+ for (i = 0; i < ARRAY_SIZE(to_check[THROTTLE_READ]); i++) {
39
BucketType index = to_check[direction][i];
40
wait = throttle_compute_wait(&ts->cfg.buckets[index]);
41
if (wait > max_wait) {
42
@@ -XXX,XX +XXX,XX @@ bool throttle_schedule_timer(ThrottleState *ts,
43
void throttle_account(ThrottleState *ts, ThrottleDirection direction,
44
uint64_t size)
45
{
46
- const BucketType bucket_types_size[2][2] = {
47
+ static const BucketType bucket_types_size[THROTTLE_MAX][2] = {
48
{ THROTTLE_BPS_TOTAL, THROTTLE_BPS_READ },
49
{ THROTTLE_BPS_TOTAL, THROTTLE_BPS_WRITE }
50
};
51
- const BucketType bucket_types_units[2][2] = {
52
+ static const BucketType bucket_types_units[THROTTLE_MAX][2] = {
53
{ THROTTLE_OPS_TOTAL, THROTTLE_OPS_READ },
54
{ THROTTLE_OPS_TOTAL, THROTTLE_OPS_WRITE }
55
};
56
@@ -XXX,XX +XXX,XX @@ void throttle_account(ThrottleState *ts, ThrottleDirection direction,
57
units = (double) size / ts->cfg.op_size;
58
}
59
60
- for (i = 0; i < 2; i++) {
61
+ for (i = 0; i < ARRAY_SIZE(bucket_types_size[THROTTLE_READ]); i++) {
62
LeakyBucket *bkt;
63
64
bkt = &ts->cfg.buckets[bucket_types_size[direction][i]];
65
--
38
--
66
2.41.0
39
2.26.2
40
1
From: zhenwei pi <pizhenwei@bytedance.com>
1
From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2
2
3
Operations on a cryptodev are considered *write*-only, so the callback
3
bdrv_is_allocated_above wrongly handles short backing files: it reports
4
for the read direction is never invoked. Use NULL instead of an unreachable
4
after-EOF space as UNALLOCATED, which is wrong because on read the data is
5
path (cryptodev_backend_throttle_timer_cb on read direction).
5
generated at the level of the short backing file (if all overlays have
6
unallocated areas at that place).
6
7
7
The dummy read timer (never invoked) is removed here, which means
8
Reusing bdrv_common_block_status_above fixes the issue and unifies the code
8
that the 'FIXME' tag is no longer needed.
9
path.
9
10
11
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
12
Reviewed-by: Eric Blake <eblake@redhat.com>
10
Reviewed-by: Alberto Garcia <berto@igalia.com>
13
Reviewed-by: Alberto Garcia <berto@igalia.com>
11
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
14
Message-id: 20200924194003.22080-5-vsementsov@virtuozzo.com
12
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
15
[Fix s/has/have/ as suggested by Eric Blake. Fix s/area/areas/.
13
Message-Id: <20230728022006.1098509-6-pizhenwei@bytedance.com>
16
--Stefan]
14
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
17
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
15
---
18
---
16
backends/cryptodev.c | 3 +--
19
block/io.c | 43 +++++--------------------------------------
17
1 file changed, 1 insertion(+), 2 deletions(-)
20
1 file changed, 5 insertions(+), 38 deletions(-)
18
21
19
diff --git a/backends/cryptodev.c b/backends/cryptodev.c
22
diff --git a/block/io.c b/block/io.c
20
index XXXXXXX..XXXXXXX 100644
23
index XXXXXXX..XXXXXXX 100644
21
--- a/backends/cryptodev.c
24
--- a/block/io.c
22
+++ b/backends/cryptodev.c
25
+++ b/block/io.c
23
@@ -XXX,XX +XXX,XX @@ static void cryptodev_backend_set_throttle(CryptoDevBackend *backend, int field,
26
@@ -XXX,XX +XXX,XX @@ int coroutine_fn bdrv_is_allocated(BlockDriverState *bs, int64_t offset,
24
if (!enabled) {
27
* at 'offset + *pnum' may return the same allocation status (in other
25
throttle_init(&backend->ts);
28
* words, the result is not necessarily the maximum possible range);
26
throttle_timers_init(&backend->tt, qemu_get_aio_context(),
29
* but 'pnum' will only be 0 when end of file is reached.
27
- QEMU_CLOCK_REALTIME,
30
- *
28
- cryptodev_backend_throttle_timer_cb, /* FIXME */
31
*/
29
+ QEMU_CLOCK_REALTIME, NULL,
32
int bdrv_is_allocated_above(BlockDriverState *top,
30
cryptodev_backend_throttle_timer_cb, backend);
33
BlockDriverState *base,
34
bool include_base, int64_t offset,
35
int64_t bytes, int64_t *pnum)
36
{
37
- BlockDriverState *intermediate;
38
- int ret;
39
- int64_t n = bytes;
40
-
41
- assert(base || !include_base);
42
-
43
- intermediate = top;
44
- while (include_base || intermediate != base) {
45
- int64_t pnum_inter;
46
- int64_t size_inter;
47
-
48
- assert(intermediate);
49
- ret = bdrv_is_allocated(intermediate, offset, bytes, &pnum_inter);
50
- if (ret < 0) {
51
- return ret;
52
- }
53
- if (ret) {
54
- *pnum = pnum_inter;
55
- return 1;
56
- }
57
-
58
- size_inter = bdrv_getlength(intermediate);
59
- if (size_inter < 0) {
60
- return size_inter;
61
- }
62
- if (n > pnum_inter &&
63
- (intermediate == top || offset + pnum_inter < size_inter)) {
64
- n = pnum_inter;
65
- }
66
-
67
- if (intermediate == base) {
68
- break;
69
- }
70
-
71
- intermediate = bdrv_filter_or_cow_bs(intermediate);
72
+ int ret = bdrv_common_block_status_above(top, base, include_base, false,
73
+ offset, bytes, pnum, NULL, NULL);
74
+ if (ret < 0) {
75
+ return ret;
31
}
76
}
32
77
78
- *pnum = n;
79
- return 0;
80
+ return !!(ret & BDRV_BLOCK_ALLOCATED);
81
}
82
83
int coroutine_fn
33
--
84
--
34
2.41.0
85
2.26.2
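The short-backing-file case described in the bdrv_is_allocated_above commit
message above can be illustrated with a toy model. This is a sketch in Python,
not QEMU code: Layer and is_allocated_above are hypothetical names, allocation
is tracked per offset rather than per cluster, the chain is assumed to be
listed top first with the base excluded, and the queried offset is assumed to
lie within the top image.

```python
from dataclasses import dataclass, field


@dataclass
class Layer:
    """Hypothetical stand-in for one image in a backing chain."""
    size: int                                     # virtual size in bytes
    allocated: set = field(default_factory=set)   # offsets holding data here


def is_allocated_above(chain, offset):
    """Return True if `offset` is provided by a layer in `chain`.

    `chain` is ordered top first and excludes the base. A backing layer
    that ends at or before `offset` stops the search: reads past its end
    yield zeroes produced at that layer, so the range counts as allocated
    there instead of being deferred further down the chain.
    """
    for depth, layer in enumerate(chain):
        if offset in layer.allocated:
            return True
        if depth > 0 and offset >= layer.size:
            # Short backing file: zeroes are synthesized at this level.
            return True
    return False


MiB = 1024 * 1024

# Same shape as the chain in iotest 274: top (2 MiB) -> mid (1 MiB) -> base.
top = Layer(size=2 * MiB)
mid = Layer(size=1 * MiB)

inside_mid = is_allocated_above([top, mid], 512)         # defers to base
beyond_mid = is_allocated_above([top, mid], MiB + 512)   # zeroes from mid
```

In this model the first query falls through to the base, while the second is
answered at the level of the short mid image. That is consistent with the
expectations in test 274, where the second MiB reads back as zeroes after a
commit into base.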
86
1
From: zhenwei pi <pizhenwei@bytedance.com>
1
From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2
2
3
Use enum ThrottleDirection instead of numeric indices in the throttle test code.
3
These cases are fixed by previous patches around block_status and
4
is_allocated.
4
5
6
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
7
Reviewed-by: Eric Blake <eblake@redhat.com>
5
Reviewed-by: Alberto Garcia <berto@igalia.com>
8
Reviewed-by: Alberto Garcia <berto@igalia.com>
6
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
9
Message-id: 20200924194003.22080-6-vsementsov@virtuozzo.com
7
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
10
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
8
Message-Id: <20230728022006.1098509-3-pizhenwei@bytedance.com>
9
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
10
---
11
---
11
tests/unit/test-throttle.c | 6 +++---
12
tests/qemu-iotests/274 | 20 +++++++++++
12
1 file changed, 3 insertions(+), 3 deletions(-)
13
tests/qemu-iotests/274.out | 68 ++++++++++++++++++++++++++++++++++++++
14
2 files changed, 88 insertions(+)
13
15
14
diff --git a/tests/unit/test-throttle.c b/tests/unit/test-throttle.c
16
diff --git a/tests/qemu-iotests/274 b/tests/qemu-iotests/274
17
index XXXXXXX..XXXXXXX 100755
18
--- a/tests/qemu-iotests/274
19
+++ b/tests/qemu-iotests/274
20
@@ -XXX,XX +XXX,XX @@ with iotests.FilePath('base') as base, \
21
iotests.qemu_io_log('-c', 'read -P 1 0 %d' % size_short, mid)
22
iotests.qemu_io_log('-c', 'read -P 0 %d %d' % (size_short, size_diff), mid)
23
24
+ iotests.log('=== Testing qemu-img commit (top -> base) ===')
25
+
26
+ create_chain()
27
+ iotests.qemu_img_log('commit', '-b', base, top)
28
+ iotests.img_info_log(base)
29
+ iotests.qemu_io_log('-c', 'read -P 1 0 %d' % size_short, base)
30
+ iotests.qemu_io_log('-c', 'read -P 0 %d %d' % (size_short, size_diff), base)
31
+
32
+ iotests.log('=== Testing QMP active commit (top -> base) ===')
33
+
34
+ create_chain()
35
+ with create_vm() as vm:
36
+ vm.launch()
37
+ vm.qmp_log('block-commit', device='top', base_node='base',
38
+ job_id='job0', auto_dismiss=False)
39
+ vm.run_job('job0', wait=5)
40
+
41
+ iotests.img_info_log(mid)
42
+ iotests.qemu_io_log('-c', 'read -P 1 0 %d' % size_short, base)
43
+ iotests.qemu_io_log('-c', 'read -P 0 %d %d' % (size_short, size_diff), base)
44
45
iotests.log('== Resize tests ==')
46
47
diff --git a/tests/qemu-iotests/274.out b/tests/qemu-iotests/274.out
15
index XXXXXXX..XXXXXXX 100644
48
index XXXXXXX..XXXXXXX 100644
16
--- a/tests/unit/test-throttle.c
49
--- a/tests/qemu-iotests/274.out
17
+++ b/tests/unit/test-throttle.c
50
+++ b/tests/qemu-iotests/274.out
18
@@ -XXX,XX +XXX,XX @@ static void test_init(void)
51
@@ -XXX,XX +XXX,XX @@ read 1048576/1048576 bytes at offset 0
19
52
read 1048576/1048576 bytes at offset 1048576
20
/* check initialized fields */
53
1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
21
g_assert(tt->clock_type == QEMU_CLOCK_VIRTUAL);
54
22
- g_assert(tt->timers[0]);
55
+=== Testing qemu-img commit (top -> base) ===
23
- g_assert(tt->timers[1]);
56
+Formatting 'TEST_DIR/PID-base', fmt=qcow2 cluster_size=65536 extended_l2=off compression_type=zlib size=2097152 lazy_refcounts=off refcount_bits=16
24
+ g_assert(tt->timers[THROTTLE_READ]);
57
+
25
+ g_assert(tt->timers[THROTTLE_WRITE]);
58
+Formatting 'TEST_DIR/PID-mid', fmt=qcow2 cluster_size=65536 extended_l2=off compression_type=zlib size=1048576 backing_file=TEST_DIR/PID-base backing_fmt=qcow2 lazy_refcounts=off refcount_bits=16
26
59
+
27
/* check other fields where cleared */
60
+Formatting 'TEST_DIR/PID-top', fmt=qcow2 cluster_size=65536 extended_l2=off compression_type=zlib size=2097152 backing_file=TEST_DIR/PID-mid backing_fmt=qcow2 lazy_refcounts=off refcount_bits=16
28
g_assert(!ts.previous_leak);
61
+
29
@@ -XXX,XX +XXX,XX @@ static void test_destroy(void)
62
+wrote 2097152/2097152 bytes at offset 0
30
throttle_timers_init(tt, ctx, QEMU_CLOCK_VIRTUAL,
63
+2 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
31
read_timer_cb, write_timer_cb, &ts);
64
+
32
throttle_timers_destroy(tt);
65
+Image committed.
33
- for (i = 0; i < 2; i++) {
66
+
34
+ for (i = 0; i < THROTTLE_MAX; i++) {
67
+image: TEST_IMG
35
g_assert(!tt->timers[i]);
68
+file format: IMGFMT
36
}
69
+virtual size: 2 MiB (2097152 bytes)
37
}
70
+cluster_size: 65536
71
+Format specific information:
72
+ compat: 1.1
73
+ compression type: zlib
74
+ lazy refcounts: false
75
+ refcount bits: 16
76
+ corrupt: false
77
+ extended l2: false
78
+
79
+read 1048576/1048576 bytes at offset 0
80
+1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
81
+
82
+read 1048576/1048576 bytes at offset 1048576
83
+1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
84
+
85
+=== Testing QMP active commit (top -> base) ===
86
+Formatting 'TEST_DIR/PID-base', fmt=qcow2 cluster_size=65536 extended_l2=off compression_type=zlib size=2097152 lazy_refcounts=off refcount_bits=16
87
+
88
+Formatting 'TEST_DIR/PID-mid', fmt=qcow2 cluster_size=65536 extended_l2=off compression_type=zlib size=1048576 backing_file=TEST_DIR/PID-base backing_fmt=qcow2 lazy_refcounts=off refcount_bits=16
89
+
90
+Formatting 'TEST_DIR/PID-top', fmt=qcow2 cluster_size=65536 extended_l2=off compression_type=zlib size=2097152 backing_file=TEST_DIR/PID-mid backing_fmt=qcow2 lazy_refcounts=off refcount_bits=16
91
+
92
+wrote 2097152/2097152 bytes at offset 0
93
+2 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
94
+
95
+{"execute": "block-commit", "arguments": {"auto-dismiss": false, "base-node": "base", "device": "top", "job-id": "job0"}}
96
+{"return": {}}
97
+{"execute": "job-complete", "arguments": {"id": "job0"}}
98
+{"return": {}}
99
+{"data": {"device": "job0", "len": 1048576, "offset": 1048576, "speed": 0, "type": "commit"}, "event": "BLOCK_JOB_READY", "timestamp": {"microseconds": "USECS", "seconds": "SECS"}}
100
+{"data": {"device": "job0", "len": 1048576, "offset": 1048576, "speed": 0, "type": "commit"}, "event": "BLOCK_JOB_COMPLETED", "timestamp": {"microseconds": "USECS", "seconds": "SECS"}}
101
+{"execute": "job-dismiss", "arguments": {"id": "job0"}}
102
+{"return": {}}
103
+image: TEST_IMG
104
+file format: IMGFMT
105
+virtual size: 1 MiB (1048576 bytes)
106
+cluster_size: 65536
107
+backing file: TEST_DIR/PID-base
108
+backing file format: IMGFMT
109
+Format specific information:
110
+ compat: 1.1
111
+ compression type: zlib
112
+ lazy refcounts: false
113
+ refcount bits: 16
114
+ corrupt: false
115
+ extended l2: false
116
+
117
+read 1048576/1048576 bytes at offset 0
118
+1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
119
+
120
+read 1048576/1048576 bytes at offset 1048576
121
+1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
122
+
123
== Resize tests ==
124
=== preallocation=off ===
125
Formatting 'TEST_DIR/PID-base', fmt=qcow2 cluster_size=65536 extended_l2=off compression_type=zlib size=6442450944 lazy_refcounts=off refcount_bits=16
38
--
126
--
39
2.41.0
127
2.26.2
128