1
The following changes since commit 8e6c70b9d4a1b1f3011805947925cfdb31642f7f:
1
The following changes since commit ac793156f650ae2d77834932d72224175ee69086:
2
2
3
Merge tag 'kraxel-20220614-pull-request' of git://git.kraxel.org/qemu into staging (2022-06-14 06:21:46 -0700)
3
Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20201020-1' into staging (2020-10-20 21:11:35 +0100)
4
4
5
are available in the Git repository at:
5
are available in the Git repository at:
6
6
7
https://gitlab.com/stefanha/qemu.git tags/block-pull-request
7
https://gitlab.com/stefanha/qemu.git tags/block-pull-request
8
8
9
for you to fetch changes up to 99b969fbe105117f5af6060d3afef40ca39cc9c1:
9
for you to fetch changes up to 32a3fd65e7e3551337fd26bfc0e2f899d70c028c:
10
10
11
linux-aio: explain why max batch is checked in laio_io_unplug() (2022-06-15 16:43:42 +0100)
11
iotests: add commit top->base cases to 274 (2020-10-22 09:55:39 +0100)
12
12
13
----------------------------------------------------------------
13
----------------------------------------------------------------
14
Pull request
14
Pull request
15
15
16
This pull request includes an important aio=native I/O stall fix, the
16
v2:
17
experimental vfio-user server, the io_uring_register_ring_fd() optimization for
17
* Fix format string issues on 32-bit hosts [Peter]
18
aio=io_uring, and an update to Vladimir Sementsov-Ogievskiy's maintainership
18
* Fix qemu-nbd.c CONFIG_POSIX ifdef issue [Eric]
19
details.
19
* Fix missing eventfd.h header on macOS [Peter]
20
* Drop unreliable vhost-user-blk test (will send a new patch when ready) [Peter]
21
22
This pull request contains the vhost-user-blk server by Coiby Xu along with my
23
additions, block/nvme.c alignment and hardware error statistics by Philippe
24
Mathieu-Daudé, and bdrv_co_block_status_above() fixes by Vladimir
25
Sementsov-Ogievskiy.
20
26
21
----------------------------------------------------------------
27
----------------------------------------------------------------
22
28
23
Jagannathan Raman (14):
29
Coiby Xu (6):
24
qdev: unplug blocker for devices
30
libvhost-user: Allow vu_message_read to be replaced
25
remote/machine: add HotplugHandler for remote machine
31
libvhost-user: remove watch for kick_fd when de-initialize vu-dev
26
remote/machine: add vfio-user property
32
util/vhost-user-server: generic vhost user server
27
vfio-user: build library
33
block: move logical block size check function to a common utility
28
vfio-user: define vfio-user-server object
34
function
29
vfio-user: instantiate vfio-user context
35
block/export: vhost-user block device backend server
30
vfio-user: find and init PCI device
36
MAINTAINERS: Add vhost-user block device backend server maintainer
31
vfio-user: run vfio-user context
32
vfio-user: handle PCI config space accesses
33
vfio-user: IOMMU support for remote device
34
vfio-user: handle DMA mappings
35
vfio-user: handle PCI BAR accesses
36
vfio-user: handle device interrupts
37
vfio-user: handle reset of remote device
38
37
39
Sam Li (1):
38
Philippe Mathieu-Daudé (1):
40
Use io_uring_register_ring_fd() to skip fd operations
39
block/nvme: Add driver statistics for access alignment and hw errors
41
40
42
Stefan Hajnoczi (2):
41
Stefan Hajnoczi (16):
43
linux-aio: fix unbalanced plugged counter in laio_io_unplug()
42
util/vhost-user-server: s/fileds/fields/ typo fix
44
linux-aio: explain why max batch is checked in laio_io_unplug()
43
util/vhost-user-server: drop unnecessary QOM cast
44
util/vhost-user-server: drop unnecessary watch deletion
45
block/export: consolidate request structs into VuBlockReq
46
util/vhost-user-server: drop unused DevicePanicNotifier
47
util/vhost-user-server: fix memory leak in vu_message_read()
48
util/vhost-user-server: check EOF when reading payload
49
util/vhost-user-server: rework vu_client_trip() coroutine lifecycle
50
block/export: report flush errors
51
block/export: convert vhost-user-blk server to block export API
52
util/vhost-user-server: move header to include/
53
util/vhost-user-server: use static library in meson.build
54
qemu-storage-daemon: avoid compiling blockdev_ss twice
55
block: move block exports to libblockdev
56
block/export: add iothread and fixed-iothread options
57
block/export: add vhost-user-blk multi-queue support
45
58
46
Vladimir Sementsov-Ogievskiy (1):
59
Vladimir Sementsov-Ogievskiy (5):
47
MAINTAINERS: update Vladimir's address and repositories
60
block/io: fix bdrv_co_block_status_above
61
block/io: bdrv_common_block_status_above: support include_base
62
block/io: bdrv_common_block_status_above: support bs == base
63
block/io: fix bdrv_is_allocated_above
64
iotests: add commit top->base cases to 274
48
65
49
MAINTAINERS | 27 +-
66
MAINTAINERS | 9 +
50
meson_options.txt | 2 +
67
qapi/block-core.json | 24 +-
51
qapi/misc.json | 31 +
68
qapi/block-export.json | 36 +-
52
qapi/qom.json | 20 +-
69
block/coroutines.h | 2 +
53
configure | 17 +
70
block/export/vhost-user-blk-server.h | 19 +
54
meson.build | 24 +-
71
contrib/libvhost-user/libvhost-user.h | 21 +
55
include/exec/memory.h | 3 +
72
include/qemu/vhost-user-server.h | 65 +++
56
include/hw/pci/msi.h | 1 +
73
util/block-helpers.h | 19 +
57
include/hw/pci/msix.h | 1 +
74
block/export/export.c | 37 +-
58
include/hw/pci/pci.h | 13 +
75
block/export/vhost-user-blk-server.c | 431 ++++++++++++++++++++
59
include/hw/qdev-core.h | 29 +
76
block/io.c | 132 +++---
60
include/hw/remote/iommu.h | 40 +
77
block/nvme.c | 27 ++
61
include/hw/remote/machine.h | 4 +
78
block/qcow2.c | 16 +-
62
include/hw/remote/vfio-user-obj.h | 6 +
79
contrib/libvhost-user/libvhost-user-glib.c | 2 +-
63
block/io_uring.c | 12 +-
80
contrib/libvhost-user/libvhost-user.c | 15 +-
64
block/linux-aio.c | 10 +-
81
hw/core/qdev-properties-system.c | 31 +-
65
hw/core/qdev.c | 24 +
82
nbd/server.c | 2 -
66
hw/pci/msi.c | 49 +-
83
qemu-nbd.c | 21 +-
67
hw/pci/msix.c | 35 +-
84
softmmu/vl.c | 4 +
68
hw/pci/pci.c | 13 +
85
stubs/blk-exp-close-all.c | 7 +
69
hw/remote/iommu.c | 131 ++++
86
tests/vhost-user-bridge.c | 2 +
70
hw/remote/machine.c | 88 ++-
87
tools/virtiofsd/fuse_virtio.c | 4 +-
71
hw/remote/vfio-user-obj.c | 958 ++++++++++++++++++++++++
88
util/block-helpers.c | 46 +++
72
softmmu/physmem.c | 4 +-
89
util/vhost-user-server.c | 446 +++++++++++++++++++++
73
softmmu/qdev-monitor.c | 4 +
90
block/export/meson.build | 3 +-
74
stubs/vfio-user-obj.c | 6 +
91
contrib/libvhost-user/meson.build | 1 +
75
tests/qtest/fuzz/generic_fuzz.c | 9 +-
92
meson.build | 22 +-
76
.gitlab-ci.d/buildtest.yml | 1 +
93
nbd/meson.build | 2 +
77
.gitmodules | 3 +
94
storage-daemon/meson.build | 3 +-
78
Kconfig.host | 4 +
95
stubs/meson.build | 1 +
79
hw/remote/Kconfig | 4 +
96
tests/qemu-iotests/274 | 20 +
80
hw/remote/meson.build | 4 +
97
tests/qemu-iotests/274.out | 68 ++++
81
hw/remote/trace-events | 11 +
98
util/meson.build | 4 +
82
scripts/meson-buildoptions.sh | 4 +
99
33 files changed, 1420 insertions(+), 122 deletions(-)
83
stubs/meson.build | 1 +
100
create mode 100644 block/export/vhost-user-blk-server.h
84
subprojects/libvfio-user | 1 +
101
create mode 100644 include/qemu/vhost-user-server.h
85
tests/docker/dockerfiles/centos8.docker | 2 +
102
create mode 100644 util/block-helpers.h
86
37 files changed, 1565 insertions(+), 31 deletions(-)
103
create mode 100644 block/export/vhost-user-blk-server.c
87
create mode 100644 include/hw/remote/iommu.h
104
create mode 100644 stubs/blk-exp-close-all.c
88
create mode 100644 include/hw/remote/vfio-user-obj.h
105
create mode 100644 util/block-helpers.c
89
create mode 100644 hw/remote/iommu.c
106
create mode 100644 util/vhost-user-server.c
90
create mode 100644 hw/remote/vfio-user-obj.c
91
create mode 100644 stubs/vfio-user-obj.c
92
create mode 160000 subprojects/libvfio-user
93
107
94
--
108
--
95
2.36.1
109
2.26.2
110
diff view generated by jsdifflib
1
From: Jagannathan Raman <jag.raman@oracle.com>
1
From: Philippe Mathieu-Daudé <philmd@redhat.com>
2
2
3
Add vfio-user to x-remote machine. It is a boolean, which indicates if
3
Keep statistics of some hardware errors, and the number of
4
the machine supports vfio-user protocol. The machine configures the bus
4
aligned/unaligned I/O accesses.
5
differently for the vfio-user and multiprocess protocols, so this property
6
informs it on how to configure the bus.
7
5
8
This property should be short-lived. Once vfio-user fully replaces
6
QMP example booting a full RHEL 8.3 aarch64 guest:
9
multiprocess, this property could be removed.
10
7
11
Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
8
{ "execute": "query-blockstats" }
12
Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
9
{
13
Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
10
"return": [
14
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
11
{
15
Message-id: 5d51a152a419cbda35d070b8e49b772b60a7230a.1655151679.git.jag.raman@oracle.com
12
"device": "",
13
"node-name": "drive0",
14
"stats": {
15
"flush_total_time_ns": 6026948,
16
"wr_highest_offset": 3383991230464,
17
"wr_total_time_ns": 807450995,
18
"failed_wr_operations": 0,
19
"failed_rd_operations": 0,
20
"wr_merged": 3,
21
"wr_bytes": 50133504,
22
"failed_unmap_operations": 0,
23
"failed_flush_operations": 0,
24
"account_invalid": false,
25
"rd_total_time_ns": 1846979900,
26
"flush_operations": 130,
27
"wr_operations": 659,
28
"rd_merged": 1192,
29
"rd_bytes": 218244096,
30
"account_failed": false,
31
"idle_time_ns": 2678641497,
32
"rd_operations": 7406,
33
},
34
"driver-specific": {
35
"driver": "nvme",
36
"completion-errors": 0,
37
"unaligned-accesses": 2959,
38
"aligned-accesses": 4477
39
},
40
"qdev": "/machine/peripheral-anon/device[0]/virtio-backend"
41
}
42
]
43
}
44
45
Suggested-by: Stefan Hajnoczi <stefanha@gmail.com>
46
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
47
Acked-by: Markus Armbruster <armbru@redhat.com>
48
Message-id: 20201001162939.1567915-1-philmd@redhat.com
16
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
49
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
17
---
50
---
18
include/hw/remote/machine.h | 2 ++
51
qapi/block-core.json | 24 +++++++++++++++++++++++-
19
hw/remote/machine.c | 23 +++++++++++++++++++++++
52
block/nvme.c | 27 +++++++++++++++++++++++++++
20
2 files changed, 25 insertions(+)
53
2 files changed, 50 insertions(+), 1 deletion(-)
21
54
22
diff --git a/include/hw/remote/machine.h b/include/hw/remote/machine.h
55
diff --git a/qapi/block-core.json b/qapi/block-core.json
23
index XXXXXXX..XXXXXXX 100644
56
index XXXXXXX..XXXXXXX 100644
24
--- a/include/hw/remote/machine.h
57
--- a/qapi/block-core.json
25
+++ b/include/hw/remote/machine.h
58
+++ b/qapi/block-core.json
26
@@ -XXX,XX +XXX,XX @@ struct RemoteMachineState {
59
@@ -XXX,XX +XXX,XX @@
27
60
'discard-nb-failed': 'uint64',
28
RemotePCIHost *host;
61
'discard-bytes-ok': 'uint64' } }
29
RemoteIOHubState iohub;
62
63
+##
64
+# @BlockStatsSpecificNvme:
65
+#
66
+# NVMe driver statistics
67
+#
68
+# @completion-errors: The number of completion errors.
69
+#
70
+# @aligned-accesses: The number of aligned accesses performed by
71
+# the driver.
72
+#
73
+# @unaligned-accesses: The number of unaligned accesses performed by
74
+# the driver.
75
+#
76
+# Since: 5.2
77
+##
78
+{ 'struct': 'BlockStatsSpecificNvme',
79
+ 'data': {
80
+ 'completion-errors': 'uint64',
81
+ 'aligned-accesses': 'uint64',
82
+ 'unaligned-accesses': 'uint64' } }
30
+
83
+
31
+ bool vfio_user;
84
##
85
# @BlockStatsSpecific:
86
#
87
@@ -XXX,XX +XXX,XX @@
88
'discriminator': 'driver',
89
'data': {
90
'file': 'BlockStatsSpecificFile',
91
- 'host_device': 'BlockStatsSpecificFile' } }
92
+ 'host_device': 'BlockStatsSpecificFile',
93
+ 'nvme': 'BlockStatsSpecificNvme' } }
94
95
##
96
# @BlockStats:
97
diff --git a/block/nvme.c b/block/nvme.c
98
index XXXXXXX..XXXXXXX 100644
99
--- a/block/nvme.c
100
+++ b/block/nvme.c
101
@@ -XXX,XX +XXX,XX @@ struct BDRVNVMeState {
102
103
/* PCI address (required for nvme_refresh_filename()) */
104
char *device;
105
+
106
+ struct {
107
+ uint64_t completion_errors;
108
+ uint64_t aligned_accesses;
109
+ uint64_t unaligned_accesses;
110
+ } stats;
32
};
111
};
33
112
34
/* Used to pass to co-routine device and ioc. */
113
#define NVME_BLOCK_OPT_DEVICE "device"
35
diff --git a/hw/remote/machine.c b/hw/remote/machine.c
114
@@ -XXX,XX +XXX,XX @@ static bool nvme_process_completion(NVMeQueuePair *q)
36
index XXXXXXX..XXXXXXX 100644
115
break;
37
--- a/hw/remote/machine.c
116
}
38
+++ b/hw/remote/machine.c
117
ret = nvme_translate_error(c);
39
@@ -XXX,XX +XXX,XX @@ static void remote_machine_init(MachineState *machine)
118
+ if (ret) {
40
qbus_set_hotplug_handler(BUS(pci_host->bus), OBJECT(s));
119
+ s->stats.completion_errors++;
120
+ }
121
q->cq.head = (q->cq.head + 1) % NVME_QUEUE_SIZE;
122
if (!q->cq.head) {
123
q->cq_phase = !q->cq_phase;
124
@@ -XXX,XX +XXX,XX @@ static int nvme_co_prw(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
125
assert(QEMU_IS_ALIGNED(bytes, s->page_size));
126
assert(bytes <= s->max_transfer);
127
if (nvme_qiov_aligned(bs, qiov)) {
128
+ s->stats.aligned_accesses++;
129
return nvme_co_prw_aligned(bs, offset, bytes, qiov, is_write, flags);
130
}
131
+ s->stats.unaligned_accesses++;
132
trace_nvme_prw_buffered(s, offset, bytes, qiov->niov, is_write);
133
buf = qemu_try_memalign(s->page_size, bytes);
134
135
@@ -XXX,XX +XXX,XX @@ static void nvme_unregister_buf(BlockDriverState *bs, void *host)
136
qemu_vfio_dma_unmap(s->vfio, host);
41
}
137
}
42
138
43
+static bool remote_machine_get_vfio_user(Object *obj, Error **errp)
139
+static BlockStatsSpecific *nvme_get_specific_stats(BlockDriverState *bs)
44
+{
140
+{
45
+ RemoteMachineState *s = REMOTE_MACHINE(obj);
141
+ BlockStatsSpecific *stats = g_new(BlockStatsSpecific, 1);
142
+ BDRVNVMeState *s = bs->opaque;
46
+
143
+
47
+ return s->vfio_user;
144
+ stats->driver = BLOCKDEV_DRIVER_NVME;
145
+ stats->u.nvme = (BlockStatsSpecificNvme) {
146
+ .completion_errors = s->stats.completion_errors,
147
+ .aligned_accesses = s->stats.aligned_accesses,
148
+ .unaligned_accesses = s->stats.unaligned_accesses,
149
+ };
150
+
151
+ return stats;
48
+}
152
+}
49
+
153
+
50
+static void remote_machine_set_vfio_user(Object *obj, bool value, Error **errp)
154
static const char *const nvme_strong_runtime_opts[] = {
51
+{
155
NVME_BLOCK_OPT_DEVICE,
52
+ RemoteMachineState *s = REMOTE_MACHINE(obj);
156
NVME_BLOCK_OPT_NAMESPACE,
53
+
157
@@ -XXX,XX +XXX,XX @@ static BlockDriver bdrv_nvme = {
54
+ if (phase_check(PHASE_MACHINE_CREATED)) {
158
.bdrv_refresh_filename = nvme_refresh_filename,
55
+ error_setg(errp, "Error enabling vfio-user - machine already created");
159
.bdrv_refresh_limits = nvme_refresh_limits,
56
+ return;
160
.strong_runtime_opts = nvme_strong_runtime_opts,
57
+ }
161
+ .bdrv_get_specific_stats = nvme_get_specific_stats,
58
+
162
59
+ s->vfio_user = value;
163
.bdrv_detach_aio_context = nvme_detach_aio_context,
60
+}
164
.bdrv_attach_aio_context = nvme_attach_aio_context,
61
+
62
static void remote_machine_class_init(ObjectClass *oc, void *data)
63
{
64
MachineClass *mc = MACHINE_CLASS(oc);
65
@@ -XXX,XX +XXX,XX @@ static void remote_machine_class_init(ObjectClass *oc, void *data)
66
mc->desc = "Experimental remote machine";
67
68
hc->unplug = qdev_simple_device_unplug_cb;
69
+
70
+ object_class_property_add_bool(oc, "vfio-user",
71
+ remote_machine_get_vfio_user,
72
+ remote_machine_set_vfio_user);
73
}
74
75
static const TypeInfo remote_machine = {
76
--
165
--
77
2.36.1
166
2.26.2
167
diff view generated by jsdifflib
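The NVMe statistics patch above classifies each request as aligned or unaligned and counts failed completions. The following is a minimal, standalone sketch of that accounting pattern, not the driver code itself. All `Demo*` and `demo_*` names are hypothetical, and the alignment rule used here (every buffer must start on a page boundary and span whole pages) is an assumption about what an alignment check such as `nvme_qiov_aligned()` tests.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the counters the patch adds to BDRVNVMeState. */
typedef struct DemoStats {
    uint64_t completion_errors;
    uint64_t aligned_accesses;
    uint64_t unaligned_accesses;
} DemoStats;

/* Hypothetical buffer descriptor, loosely modelled on one iovec entry. */
typedef struct DemoBuf {
    uintptr_t addr;
    size_t len;
} DemoBuf;

/*
 * Assumed alignment rule: every buffer starts on a page boundary and its
 * length is a multiple of the page size. page_size must be a power of two.
 */
static bool demo_bufs_aligned(const DemoBuf *bufs, size_t n, size_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    for (size_t i = 0; i < n; i++) {
        if ((bufs[i].addr & (page_size - 1)) != 0 ||
            (bufs[i].len & (page_size - 1)) != 0) {
            return false;
        }
    }
    return true;
}

/* Classify one request and bump the matching counter. */
static void demo_account_access(DemoStats *st, const DemoBuf *bufs,
                                size_t n, size_t page_size)
{
    if (demo_bufs_aligned(bufs, n, page_size)) {
        st->aligned_accesses++;
    } else {
        st->unaligned_accesses++;
    }
}

/* Record one completion; any nonzero status counts as an error. */
static void demo_account_completion(DemoStats *st, int status)
{
    if (status != 0) {
        st->completion_errors++;
    }
}
```

In the patch, the counters are plain `uint64_t` fields that are copied into a `BlockStatsSpecificNvme` value when `query-blockstats` is executed, so reading them needs no extra bookkeeping beyond the increments.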
1
From: Jagannathan Raman <jag.raman@oracle.com>
1
From: Coiby Xu <coiby.xu@gmail.com>
2
2
3
Set up a handler to run the vfio-user context. The context is driven by
3
Allow vu_message_read to be replaced by one which will make use of the
4
messages to the file descriptor associated with it - get the fd for
4
QIOChannel functions. Thus reading a vhost-user message won't stall the
5
the context and hook up the handler with it.
5
guest. For the slave channel, we still use the default vu_message_read.
6
6
7
Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
7
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
8
Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
8
Signed-off-by: Coiby Xu <coiby.xu@gmail.com>
9
Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
10
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
9
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
11
Message-id: e934b0090529d448b6a7972b21dfc3d7421ce494.1655151679.git.jag.raman@oracle.com
10
Message-id: 20200918080912.321299-2-coiby.xu@gmail.com
12
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
11
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
13
---
12
---
14
qapi/misc.json | 31 ++++++++++
13
contrib/libvhost-user/libvhost-user.h | 21 +++++++++++++++++++++
15
hw/remote/vfio-user-obj.c | 118 +++++++++++++++++++++++++++++++++++++-
14
contrib/libvhost-user/libvhost-user-glib.c | 2 +-
16
2 files changed, 148 insertions(+), 1 deletion(-)
15
contrib/libvhost-user/libvhost-user.c | 14 +++++++-------
16
tests/vhost-user-bridge.c | 2 ++
17
tools/virtiofsd/fuse_virtio.c | 4 ++--
18
5 files changed, 33 insertions(+), 10 deletions(-)
17
19
18
diff --git a/qapi/misc.json b/qapi/misc.json
20
diff --git a/contrib/libvhost-user/libvhost-user.h b/contrib/libvhost-user/libvhost-user.h
19
index XXXXXXX..XXXXXXX 100644
21
index XXXXXXX..XXXXXXX 100644
20
--- a/qapi/misc.json
22
--- a/contrib/libvhost-user/libvhost-user.h
21
+++ b/qapi/misc.json
23
+++ b/contrib/libvhost-user/libvhost-user.h
22
@@ -XXX,XX +XXX,XX @@
24
@@ -XXX,XX +XXX,XX @@
23
##
25
*/
24
{ 'event': 'RTC_CHANGE',
26
#define VHOST_USER_MAX_RAM_SLOTS 32
25
'data': { 'offset': 'int', 'qom-path': 'str' } }
27
28
+#define VHOST_USER_HDR_SIZE offsetof(VhostUserMsg, payload.u64)
26
+
29
+
27
+##
30
typedef enum VhostSetConfigType {
28
+# @VFU_CLIENT_HANGUP:
31
VHOST_SET_CONFIG_TYPE_MASTER = 0,
29
+#
32
VHOST_SET_CONFIG_TYPE_MIGRATION = 1,
30
+# Emitted when the client of a TYPE_VFIO_USER_SERVER closes the
33
@@ -XXX,XX +XXX,XX @@ typedef uint64_t (*vu_get_features_cb) (VuDev *dev);
31
+# communication channel
34
typedef void (*vu_set_features_cb) (VuDev *dev, uint64_t features);
32
+#
35
typedef int (*vu_process_msg_cb) (VuDev *dev, VhostUserMsg *vmsg,
33
+# @vfu-id: ID of the TYPE_VFIO_USER_SERVER object. It is the last component
36
int *do_reply);
34
+# of @vfu-qom-path referenced below
37
+typedef bool (*vu_read_msg_cb) (VuDev *dev, int sock, VhostUserMsg *vmsg);
35
+#
38
typedef void (*vu_queue_set_started_cb) (VuDev *dev, int qidx, bool started);
36
+# @vfu-qom-path: path to the TYPE_VFIO_USER_SERVER object in the QOM tree
39
typedef bool (*vu_queue_is_processed_in_order_cb) (VuDev *dev, int qidx);
37
+#
40
typedef int (*vu_get_config_cb) (VuDev *dev, uint8_t *config, uint32_t len);
38
+# @dev-id: ID of attached PCI device
41
@@ -XXX,XX +XXX,XX @@ struct VuDev {
39
+#
42
bool broken;
40
+# @dev-qom-path: path to attached PCI device in the QOM tree
43
uint16_t max_queues;
41
+#
44
42
+# Since: 7.1
45
+ /* @read_msg: custom method to read vhost-user message
43
+#
46
+ *
44
+# Example:
47
+ * Read data from vhost_user socket fd and fill up
45
+#
48
+ * the passed VhostUserMsg *vmsg struct.
46
+# <- { "event": "VFU_CLIENT_HANGUP",
49
+ *
47
+# "data": { "vfu-id": "vfu1",
50
+ * If reading fails, it should close the received set of file
48
+# "vfu-qom-path": "/objects/vfu1",
51
+ * descriptors as socket message's auxiliary data.
49
+# "dev-id": "sas1",
52
+ *
50
+# "dev-qom-path": "/machine/peripheral/sas1" },
53
+ * For the details, please refer to vu_message_read in libvhost-user.c
51
+# "timestamp": { "seconds": 1265044230, "microseconds": 450486 } }
54
+ * which will be used by default if not custom method is provided when
52
+#
55
+ * calling vu_init
53
+##
56
+ *
54
+{ 'event': 'VFU_CLIENT_HANGUP',
57
+ * Returns: true if vhost-user message successfully received,
55
+ 'data': { 'vfu-id': 'str', 'vfu-qom-path': 'str',
58
+ * otherwise return false.
56
+ 'dev-id': 'str', 'dev-qom-path': 'str' } }
59
+ *
57
diff --git a/hw/remote/vfio-user-obj.c b/hw/remote/vfio-user-obj.c
60
+ */
61
+ vu_read_msg_cb read_msg;
62
/* @set_watch: add or update the given fd to the watch set,
63
* call cb when condition is met */
64
vu_set_watch_cb set_watch;
65
@@ -XXX,XX +XXX,XX @@ bool vu_init(VuDev *dev,
66
uint16_t max_queues,
67
int socket,
68
vu_panic_cb panic,
69
+ vu_read_msg_cb read_msg,
70
vu_set_watch_cb set_watch,
71
vu_remove_watch_cb remove_watch,
72
const VuDevIface *iface);
73
diff --git a/contrib/libvhost-user/libvhost-user-glib.c b/contrib/libvhost-user/libvhost-user-glib.c
58
index XXXXXXX..XXXXXXX 100644
74
index XXXXXXX..XXXXXXX 100644
59
--- a/hw/remote/vfio-user-obj.c
75
--- a/contrib/libvhost-user/libvhost-user-glib.c
60
+++ b/hw/remote/vfio-user-obj.c
76
+++ b/contrib/libvhost-user/libvhost-user-glib.c
77
@@ -XXX,XX +XXX,XX @@ vug_init(VugDev *dev, uint16_t max_queues, int socket,
78
g_assert(dev);
79
g_assert(iface);
80
81
- if (!vu_init(&dev->parent, max_queues, socket, panic, set_watch,
82
+ if (!vu_init(&dev->parent, max_queues, socket, panic, NULL, set_watch,
83
remove_watch, iface)) {
84
return false;
85
}
86
diff --git a/contrib/libvhost-user/libvhost-user.c b/contrib/libvhost-user/libvhost-user.c
87
index XXXXXXX..XXXXXXX 100644
88
--- a/contrib/libvhost-user/libvhost-user.c
89
+++ b/contrib/libvhost-user/libvhost-user.c
61
@@ -XXX,XX +XXX,XX @@
90
@@ -XXX,XX +XXX,XX @@
62
*
91
/* The version of inflight buffer */
63
* device - id of a device on the server, a required option. PCI devices
92
#define INFLIGHT_VERSION 1
64
* alone are supported presently.
93
65
+ *
94
-#define VHOST_USER_HDR_SIZE offsetof(VhostUserMsg, payload.u64)
66
+ * notes - x-vfio-user-server could block IO and monitor during the
95
-
67
+ * initialization phase.
96
/* The version of the protocol we support */
68
*/
97
#define VHOST_USER_VERSION 1
69
98
#define LIBVHOST_USER_DEBUG 0
70
#include "qemu/osdep.h"
99
@@ -XXX,XX +XXX,XX @@ have_userfault(void)
71
@@ -XXX,XX +XXX,XX @@
72
#include "hw/remote/machine.h"
73
#include "qapi/error.h"
74
#include "qapi/qapi-visit-sockets.h"
75
+#include "qapi/qapi-events-misc.h"
76
#include "qemu/notify.h"
77
+#include "qemu/thread.h"
78
#include "sysemu/sysemu.h"
79
#include "libvfio-user.h"
80
#include "hw/qdev-core.h"
81
#include "hw/pci/pci.h"
82
+#include "qemu/timer.h"
83
84
#define TYPE_VFU_OBJECT "x-vfio-user-server"
85
OBJECT_DECLARE_TYPE(VfuObject, VfuObjectClass, VFU_OBJECT)
86
@@ -XXX,XX +XXX,XX @@ struct VfuObject {
87
PCIDevice *pci_dev;
88
89
Error *unplug_blocker;
90
+
91
+ int vfu_poll_fd;
92
};
93
94
static void vfu_object_init_ctx(VfuObject *o, Error **errp);
95
@@ -XXX,XX +XXX,XX @@ static void vfu_object_set_device(Object *obj, const char *str, Error **errp)
96
vfu_object_init_ctx(o, errp);
97
}
100
}
98
101
99
+static void vfu_object_ctx_run(void *opaque)
102
static bool
100
+{
103
-vu_message_read(VuDev *dev, int conn_fd, VhostUserMsg *vmsg)
101
+ VfuObject *o = opaque;
104
+vu_message_read_default(VuDev *dev, int conn_fd, VhostUserMsg *vmsg)
102
+ const char *vfu_id;
105
{
103
+ char *vfu_path, *pci_dev_path;
106
char control[CMSG_SPACE(VHOST_MEMORY_BASELINE_NREGIONS * sizeof(int))] = {};
104
+ int ret = -1;
107
struct iovec iov = {
105
+
108
@@ -XXX,XX +XXX,XX @@ vu_process_message_reply(VuDev *dev, const VhostUserMsg *vmsg)
106
+ while (ret != 0) {
109
goto out;
107
+ ret = vfu_run_ctx(o->vfu_ctx);
108
+ if (ret < 0) {
109
+ if (errno == EINTR) {
110
+ continue;
111
+ } else if (errno == ENOTCONN) {
112
+ vfu_id = object_get_canonical_path_component(OBJECT(o));
113
+ vfu_path = object_get_canonical_path(OBJECT(o));
114
+ g_assert(o->pci_dev);
115
+ pci_dev_path = object_get_canonical_path(OBJECT(o->pci_dev));
116
+ /* o->device is a required property and is non-NULL here */
117
+ g_assert(o->device);
118
+ qapi_event_send_vfu_client_hangup(vfu_id, vfu_path,
119
+ o->device, pci_dev_path);
120
+ qemu_set_fd_handler(o->vfu_poll_fd, NULL, NULL, NULL);
121
+ o->vfu_poll_fd = -1;
122
+ object_unparent(OBJECT(o));
123
+ g_free(vfu_path);
124
+ g_free(pci_dev_path);
125
+ break;
126
+ } else {
127
+ VFU_OBJECT_ERROR(o, "vfu: Failed to run device %s - %s",
128
+ o->device, strerror(errno));
129
+ break;
130
+ }
131
+ }
132
+ }
133
+}
134
+
135
+static void vfu_object_attach_ctx(void *opaque)
136
+{
137
+ VfuObject *o = opaque;
138
+ GPollFD pfds[1];
139
+ int ret;
140
+
141
+ qemu_set_fd_handler(o->vfu_poll_fd, NULL, NULL, NULL);
142
+
143
+ pfds[0].fd = o->vfu_poll_fd;
144
+ pfds[0].events = G_IO_IN | G_IO_HUP | G_IO_ERR;
145
+
146
+retry_attach:
147
+ ret = vfu_attach_ctx(o->vfu_ctx);
148
+ if (ret < 0 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
149
+ /**
150
+ * vfu_object_attach_ctx can block QEMU's main loop
151
+ * during attach - the monitor and other IO
152
+ * could be unresponsive during this time.
153
+ */
154
+ (void)qemu_poll_ns(pfds, 1, 500 * (int64_t)SCALE_MS);
155
+ goto retry_attach;
156
+ } else if (ret < 0) {
157
+ VFU_OBJECT_ERROR(o, "vfu: Failed to attach device %s to context - %s",
158
+ o->device, strerror(errno));
159
+ return;
160
+ }
161
+
162
+ o->vfu_poll_fd = vfu_get_poll_fd(o->vfu_ctx);
163
+ if (o->vfu_poll_fd < 0) {
164
+ VFU_OBJECT_ERROR(o, "vfu: Failed to get poll fd %s", o->device);
165
+ return;
166
+ }
167
+
168
+ qemu_set_fd_handler(o->vfu_poll_fd, vfu_object_ctx_run, NULL, o);
169
+}
170
+
171
/*
172
* TYPE_VFU_OBJECT depends on the availability of the 'socket' and 'device'
173
* properties. It also depends on devices instantiated in QEMU. These
174
@@ -XXX,XX +XXX,XX @@ static void vfu_object_machine_done(Notifier *notifier, void *data)
175
}
110
}
176
}
111
177
112
- if (!vu_message_read(dev, dev->slave_fd, &msg_reply)) {
178
+/**
113
+ if (!vu_message_read_default(dev, dev->slave_fd, &msg_reply)) {
179
+ * vfu_object_init_ctx: Create and initialize libvfio-user context. Add
114
goto out;
180
+ * an unplug blocker for the associated PCI device. Setup a FD handler
115
}
181
+ * to process incoming messages in the context's socket.
116
182
+ *
117
@@ -XXX,XX +XXX,XX @@ vu_set_mem_table_exec_postcopy(VuDev *dev, VhostUserMsg *vmsg)
183
+ * The socket and device properties are mandatory, and this function
118
/* Wait for QEMU to confirm that it's registered the handler for the
184
+ * will not create the context without them - the setters for these
119
* faults.
185
+ * properties should call this function when the property is set. The
120
*/
186
+ * machine should also be ready when this function is invoked - it is
121
- if (!vu_message_read(dev, dev->sock, vmsg) ||
187
+ * because QEMU objects are initialized before devices, and the
122
+ if (!dev->read_msg(dev, dev->sock, vmsg) ||
188
+ * associated PCI device wouldn't be available at the object
123
vmsg->size != sizeof(vmsg->payload.u64) ||
189
+ * initialization time. Until these conditions are satisfied, this
124
vmsg->payload.u64 != 0) {
190
+ * function would return early without performing any task.
125
vu_panic(dev, "failed to receive valid ack for postcopy set-mem-table");
191
+ */
126
@@ -XXX,XX +XXX,XX @@ vu_dispatch(VuDev *dev)
192
static void vfu_object_init_ctx(VfuObject *o, Error **errp)
127
int reply_requested;
193
{
128
bool need_reply, success = false;
194
ERRP_GUARD();
129
195
@@ -XXX,XX +XXX,XX @@ static void vfu_object_init_ctx(VfuObject *o, Error **errp)
130
- if (!vu_message_read(dev, dev->sock, &vmsg)) {
131
+ if (!dev->read_msg(dev, dev->sock, &vmsg)) {
132
goto end;
133
}
134
135
@@ -XXX,XX +XXX,XX @@ vu_init(VuDev *dev,
136
uint16_t max_queues,
137
int socket,
138
vu_panic_cb panic,
139
+ vu_read_msg_cb read_msg,
140
vu_set_watch_cb set_watch,
141
vu_remove_watch_cb remove_watch,
142
const VuDevIface *iface)
143
@@ -XXX,XX +XXX,XX @@ vu_init(VuDev *dev,
144
145
dev->sock = socket;
146
dev->panic = panic;
147
+ dev->read_msg = read_msg ? read_msg : vu_message_read_default;
148
dev->set_watch = set_watch;
149
dev->remove_watch = remove_watch;
150
dev->iface = iface;
151
@@ -XXX,XX +XXX,XX @@ static void _vu_queue_notify(VuDev *dev, VuVirtq *vq, bool sync)
152
153
vu_message_write(dev, dev->slave_fd, &vmsg);
154
if (ack) {
155
- vu_message_read(dev, dev->slave_fd, &vmsg);
156
+ vu_message_read_default(dev, dev->slave_fd, &vmsg);
157
}
196
return;
158
return;
197
}
159
}
198
160
diff --git a/tests/vhost-user-bridge.c b/tests/vhost-user-bridge.c
199
- o->vfu_ctx = vfu_create_ctx(VFU_TRANS_SOCK, o->socket->u.q_unix.path, 0,
161
index XXXXXXX..XXXXXXX 100644
200
+ o->vfu_ctx = vfu_create_ctx(VFU_TRANS_SOCK, o->socket->u.q_unix.path,
162
--- a/tests/vhost-user-bridge.c
201
+ LIBVFIO_USER_FLAG_ATTACH_NB,
163
+++ b/tests/vhost-user-bridge.c
202
o, VFU_DEV_TYPE_PCI);
164
@@ -XXX,XX +XXX,XX @@ vubr_accept_cb(int sock, void *ctx)
203
if (o->vfu_ctx == NULL) {
165
VHOST_USER_BRIDGE_MAX_QUEUES,
204
error_setg(errp, "vfu: Failed to create context - %s", strerror(errno));
166
conn_fd,
205
@@ -XXX,XX +XXX,XX @@ static void vfu_object_init_ctx(VfuObject *o, Error **errp)
167
vubr_panic,
206
TYPE_VFU_OBJECT, o->device);
168
+ NULL,
207
qdev_add_unplug_blocker(DEVICE(o->pci_dev), o->unplug_blocker);
169
vubr_set_watch,
208
170
vubr_remove_watch,
209
+ ret = vfu_realize_ctx(o->vfu_ctx);
171
&vuiface)) {
210
+ if (ret < 0) {
172
@@ -XXX,XX +XXX,XX @@ vubr_new(const char *path, bool client)
211
+ error_setg(errp, "vfu: Failed to realize device %s- %s",
173
VHOST_USER_BRIDGE_MAX_QUEUES,
212
+ o->device, strerror(errno));
174
dev->sock,
213
+ goto fail;
175
vubr_panic,
214
+ }
176
+ NULL,
215
+
177
vubr_set_watch,
216
+ o->vfu_poll_fd = vfu_get_poll_fd(o->vfu_ctx);
178
vubr_remove_watch,
217
+ if (o->vfu_poll_fd < 0) {
179
&vuiface)) {
218
+ error_setg(errp, "vfu: Failed to get poll fd %s", o->device);
180
diff --git a/tools/virtiofsd/fuse_virtio.c b/tools/virtiofsd/fuse_virtio.c
219
+ goto fail;
181
index XXXXXXX..XXXXXXX 100644
220
+ }
182
--- a/tools/virtiofsd/fuse_virtio.c
221
+
183
+++ b/tools/virtiofsd/fuse_virtio.c
222
+ qemu_set_fd_handler(o->vfu_poll_fd, vfu_object_attach_ctx, NULL, o);
184
@@ -XXX,XX +XXX,XX @@ int virtio_session_mount(struct fuse_session *se)
223
+
185
se->vu_socketfd = data_sock;
224
return;
186
se->virtio_dev->se = se;
225
187
pthread_rwlock_init(&se->virtio_dev->vu_dispatch_rwlock, NULL);
226
fail:
188
- vu_init(&se->virtio_dev->dev, 2, se->vu_socketfd, fv_panic, fv_set_watch,
227
@@ -XXX,XX +XXX,XX @@ static void vfu_object_init(Object *obj)
189
- fv_remove_watch, &fv_iface);
228
qemu_add_machine_init_done_notifier(&o->machine_done);
190
+ vu_init(&se->virtio_dev->dev, 2, se->vu_socketfd, fv_panic, NULL,
229
}
191
+ fv_set_watch, fv_remove_watch, &fv_iface);
230
192
231
+ o->vfu_poll_fd = -1;
193
return 0;
232
}
194
}
233
234
static void vfu_object_finalize(Object *obj)
235
@@ -XXX,XX +XXX,XX @@ static void vfu_object_finalize(Object *obj)
236
237
o->socket = NULL;
238
239
+ if (o->vfu_poll_fd != -1) {
240
+ qemu_set_fd_handler(o->vfu_poll_fd, NULL, NULL, NULL);
241
+ o->vfu_poll_fd = -1;
242
+ }
243
+
244
if (o->vfu_ctx) {
245
vfu_destroy_ctx(o->vfu_ctx);
246
o->vfu_ctx = NULL;
247
--
195
--
248
2.36.1
196
2.26.2
197
diff view generated by jsdifflib
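The libvhost-user patch above lets the caller of `vu_init()` pass a `read_msg` callback and falls back to `vu_message_read_default` when it is NULL. The sketch below shows that optional-callback pattern in isolation. The `Demo*` types and `demo_*` functions are hypothetical stand-ins and do not read from a real socket; only the NULL fallback in `demo_init()` mirrors the assignment in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct DemoDev DemoDev;

/* Hypothetical, drastically reduced stand-in for VhostUserMsg. */
typedef struct DemoMsg {
    int request;
} DemoMsg;

/* Same shape as vu_read_msg_cb in the patch, using the demo types. */
typedef bool (*demo_read_msg_cb)(DemoDev *dev, int sock, DemoMsg *msg);

struct DemoDev {
    int sock;
    demo_read_msg_cb read_msg;
};

/* Default reader: plays the role of vu_message_read_default(). */
static bool demo_read_default(DemoDev *dev, int sock, DemoMsg *msg)
{
    (void)dev;
    (void)sock;
    msg->request = 1;   /* marker so callers can tell which reader ran */
    return true;
}

/* Custom reader: plays the role of a QIOChannel-based implementation. */
static bool demo_read_custom(DemoDev *dev, int sock, DemoMsg *msg)
{
    (void)dev;
    (void)sock;
    msg->request = 2;
    return true;
}

/* Install the caller's reader, or the default one when none is given. */
static void demo_init(DemoDev *dev, int sock, demo_read_msg_cb read_msg)
{
    assert(dev != NULL);
    dev->sock = sock;
    dev->read_msg = read_msg ? read_msg : demo_read_default;
}

/* Dispatch always goes through the pointer, never the default directly. */
static bool demo_dispatch(DemoDev *dev, DemoMsg *msg)
{
    return dev->read_msg(dev, dev->sock, msg);
}
```

Existing callers that have no special requirements pass NULL, which is what the patch does for `vug_init()`, `vhost-user-bridge` and `virtiofsd`. The slave channel in the patch keeps calling the default reader directly instead of going through the pointer.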
New patch
1
From: Coiby Xu <coiby.xu@gmail.com>
1
2
3
When the client is running in gdb and the quit command is run in gdb,
4
QEMU will still dispatch the event, which will cause a segmentation fault in
5
the callback function.

Signed-off-by: Coiby Xu <coiby.xu@gmail.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20200918080912.321299-3-coiby.xu@gmail.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 contrib/libvhost-user/libvhost-user.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/contrib/libvhost-user/libvhost-user.c b/contrib/libvhost-user/libvhost-user.c
index XXXXXXX..XXXXXXX 100644
--- a/contrib/libvhost-user/libvhost-user.c
+++ b/contrib/libvhost-user/libvhost-user.c
@@ -XXX,XX +XXX,XX @@ vu_deinit(VuDev *dev)
         }
 
         if (vq->kick_fd != -1) {
+            dev->remove_watch(dev, vq->kick_fd);
             close(vq->kick_fd);
             vq->kick_fd = -1;
         }
--
2.26.2
New patch
1
From: Coiby Xu <coiby.xu@gmail.com>
1
2
3
Sharing QEMU devices via vhost-user protocol.
4
5
Only one vhost-user client can connect to the server at a time.
6
7
Suggested-by: Kevin Wolf <kwolf@redhat.com>
8
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
9
Signed-off-by: Coiby Xu <coiby.xu@gmail.com>
10
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
11
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
12
Message-id: 20200918080912.321299-4-coiby.xu@gmail.com
13
[Fixed size_t %lu -> %zu format string compiler error.
14
--Stefan]
15
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
16
---
17
util/vhost-user-server.h | 65 ++++++
18
util/vhost-user-server.c | 428 +++++++++++++++++++++++++++++++++++++++
19
util/meson.build | 1 +
20
3 files changed, 494 insertions(+)
21
create mode 100644 util/vhost-user-server.h
22
create mode 100644 util/vhost-user-server.c
23
24
diff --git a/util/vhost-user-server.h b/util/vhost-user-server.h
25
new file mode 100644
26
index XXXXXXX..XXXXXXX
27
--- /dev/null
28
+++ b/util/vhost-user-server.h
29
@@ -XXX,XX +XXX,XX @@
30
+/*
31
+ * Sharing QEMU devices via vhost-user protocol
32
+ *
33
+ * Copyright (c) Coiby Xu <coiby.xu@gmail.com>.
34
+ * Copyright (c) 2020 Red Hat, Inc.
35
+ *
36
+ * This work is licensed under the terms of the GNU GPL, version 2 or
37
+ * later. See the COPYING file in the top-level directory.
38
+ */
39
+
40
+#ifndef VHOST_USER_SERVER_H
41
+#define VHOST_USER_SERVER_H
42
+
43
+#include "contrib/libvhost-user/libvhost-user.h"
44
+#include "io/channel-socket.h"
45
+#include "io/channel-file.h"
46
+#include "io/net-listener.h"
47
+#include "qemu/error-report.h"
48
+#include "qapi/error.h"
49
+#include "standard-headers/linux/virtio_blk.h"
50
+
51
+typedef struct VuFdWatch {
52
+ VuDev *vu_dev;
53
+ int fd; /*kick fd*/
54
+ void *pvt;
55
+ vu_watch_cb cb;
56
+ bool processing;
57
+ QTAILQ_ENTRY(VuFdWatch) next;
58
+} VuFdWatch;
59
+
60
+typedef struct VuServer VuServer;
61
+typedef void DevicePanicNotifierFn(VuServer *server);
62
+
63
+struct VuServer {
64
+ QIONetListener *listener;
65
+ AioContext *ctx;
66
+ DevicePanicNotifierFn *device_panic_notifier;
67
+ int max_queues;
68
+ const VuDevIface *vu_iface;
69
+ VuDev vu_dev;
70
+ QIOChannel *ioc; /* The I/O channel with the client */
71
+ QIOChannelSocket *sioc; /* The underlying data channel with the client */
72
+ /* IOChannel for fd provided via VHOST_USER_SET_SLAVE_REQ_FD */
73
+ QIOChannel *ioc_slave;
74
+ QIOChannelSocket *sioc_slave;
75
+ Coroutine *co_trip; /* coroutine for processing VhostUserMsg */
76
+ QTAILQ_HEAD(, VuFdWatch) vu_fd_watches;
77
+ /* restart coroutine co_trip if AIOContext is changed */
78
+ bool aio_context_changed;
79
+ bool processing_msg;
80
+};
81
+
82
+bool vhost_user_server_start(VuServer *server,
83
+ SocketAddress *unix_socket,
84
+ AioContext *ctx,
85
+ uint16_t max_queues,
86
+ DevicePanicNotifierFn *device_panic_notifier,
87
+ const VuDevIface *vu_iface,
88
+ Error **errp);
89
+
90
+void vhost_user_server_stop(VuServer *server);
91
+
92
+void vhost_user_server_set_aio_context(VuServer *server, AioContext *ctx);
93
+
94
+#endif /* VHOST_USER_SERVER_H */
95
diff --git a/util/vhost-user-server.c b/util/vhost-user-server.c
96
new file mode 100644
97
index XXXXXXX..XXXXXXX
98
--- /dev/null
99
+++ b/util/vhost-user-server.c
100
@@ -XXX,XX +XXX,XX @@
101
+/*
102
+ * Sharing QEMU devices via vhost-user protocol
103
+ *
104
+ * Copyright (c) Coiby Xu <coiby.xu@gmail.com>.
105
+ * Copyright (c) 2020 Red Hat, Inc.
106
+ *
107
+ * This work is licensed under the terms of the GNU GPL, version 2 or
108
+ * later. See the COPYING file in the top-level directory.
109
+ */
110
+#include "qemu/osdep.h"
111
+#include "qemu/main-loop.h"
112
+#include "vhost-user-server.h"
113
+
114
+static void vmsg_close_fds(VhostUserMsg *vmsg)
115
+{
116
+ int i;
117
+ for (i = 0; i < vmsg->fd_num; i++) {
118
+ close(vmsg->fds[i]);
119
+ }
120
+}
121
+
122
+static void vmsg_unblock_fds(VhostUserMsg *vmsg)
123
+{
124
+ int i;
125
+ for (i = 0; i < vmsg->fd_num; i++) {
126
+ qemu_set_nonblock(vmsg->fds[i]);
127
+ }
128
+}
129
+
130
+static void vu_accept(QIONetListener *listener, QIOChannelSocket *sioc,
131
+ gpointer opaque);
132
+
133
+static void close_client(VuServer *server)
134
+{
135
+ /*
136
+ * Before closing the client
137
+ *
138
+ * 1. Let vu_client_trip stop processing new vhost-user msg
139
+ *
140
+ * 2. remove kick_handler
141
+ *
142
+ * 3. wait for the kick handler to be finished
143
+ *
144
+ * 4. wait for the current vhost-user msg to be finished processing
145
+ */
146
+
147
+ QIOChannelSocket *sioc = server->sioc;
148
+ /* When this is set, vu_client_trip will stop processing new vhost-user messages */
149
+ server->sioc = NULL;
150
+
151
+ VuFdWatch *vu_fd_watch, *next;
152
+ QTAILQ_FOREACH_SAFE(vu_fd_watch, &server->vu_fd_watches, next, next) {
153
+ aio_set_fd_handler(server->ioc->ctx, vu_fd_watch->fd, true, NULL,
154
+ NULL, NULL, NULL);
155
+ }
156
+
157
+ while (!QTAILQ_EMPTY(&server->vu_fd_watches)) {
158
+ QTAILQ_FOREACH_SAFE(vu_fd_watch, &server->vu_fd_watches, next, next) {
159
+ if (!vu_fd_watch->processing) {
160
+ QTAILQ_REMOVE(&server->vu_fd_watches, vu_fd_watch, next);
161
+ g_free(vu_fd_watch);
162
+ }
163
+ }
164
+ }
165
+
166
+ while (server->processing_msg) {
167
+ if (server->ioc->read_coroutine) {
168
+ server->ioc->read_coroutine = NULL;
169
+ qio_channel_set_aio_fd_handler(server->ioc, server->ioc->ctx, NULL,
170
+ NULL, server->ioc);
171
+ server->processing_msg = false;
172
+ }
173
+ }
174
+
175
+ vu_deinit(&server->vu_dev);
176
+ object_unref(OBJECT(sioc));
177
+ object_unref(OBJECT(server->ioc));
178
+}
179
+
180
+static void panic_cb(VuDev *vu_dev, const char *buf)
181
+{
182
+ VuServer *server = container_of(vu_dev, VuServer, vu_dev);
183
+
184
+ /* avoid while loop in close_client */
185
+ server->processing_msg = false;
186
+
187
+ if (buf) {
188
+ error_report("vu_panic: %s", buf);
189
+ }
190
+
191
+ if (server->sioc) {
192
+ close_client(server);
193
+ }
194
+
195
+ if (server->device_panic_notifier) {
196
+ server->device_panic_notifier(server);
197
+ }
198
+
199
+ /*
200
+ * Set the callback function for network listener so another
201
+ * vhost-user client can connect to this server
202
+ */
203
+ qio_net_listener_set_client_func(server->listener,
204
+ vu_accept,
205
+ server,
206
+ NULL);
207
+}
208
+
209
+static bool coroutine_fn
210
+vu_message_read(VuDev *vu_dev, int conn_fd, VhostUserMsg *vmsg)
211
+{
212
+ struct iovec iov = {
213
+ .iov_base = (char *)vmsg,
214
+ .iov_len = VHOST_USER_HDR_SIZE,
215
+ };
216
+ int rc, read_bytes = 0;
217
+ Error *local_err = NULL;
218
+ /*
219
+ * Store fds/nfds returned from qio_channel_readv_full into
220
+ * temporary variables.
221
+ *
222
+ * VhostUserMsg is a packed structure, gcc will complain about passing
223
+ * pointer to a packed structure member if we pass &VhostUserMsg.fd_num
224
+ * and &VhostUserMsg.fds directly when calling qio_channel_readv_full,
225
+ * thus two temporary variables nfds and fds are used here.
226
+ */
227
+ size_t nfds = 0, nfds_t = 0;
228
+ const size_t max_fds = G_N_ELEMENTS(vmsg->fds);
229
+ int *fds_t = NULL;
230
+ VuServer *server = container_of(vu_dev, VuServer, vu_dev);
231
+ QIOChannel *ioc = server->ioc;
232
+
233
+ if (!ioc) {
234
+ error_report_err(local_err);
235
+ goto fail;
236
+ }
237
+
238
+ assert(qemu_in_coroutine());
239
+ do {
240
+ /*
241
+ * qio_channel_readv_full may return short reads; keep calling it
+ * until VHOST_USER_HDR_SIZE bytes have been read in total or it returns 0
243
+ */
244
+ rc = qio_channel_readv_full(ioc, &iov, 1, &fds_t, &nfds_t, &local_err);
245
+ if (rc < 0) {
246
+ if (rc == QIO_CHANNEL_ERR_BLOCK) {
247
+ qio_channel_yield(ioc, G_IO_IN);
248
+ continue;
249
+ } else {
250
+ error_report_err(local_err);
251
+ return false;
252
+ }
253
+ }
254
+ read_bytes += rc;
255
+ if (nfds_t > 0) {
256
+ if (nfds + nfds_t > max_fds) {
257
+ error_report("A maximum of %zu fds are allowed, "
258
+ "however got %zu fds now",
259
+ max_fds, nfds + nfds_t);
260
+ goto fail;
261
+ }
262
+ memcpy(vmsg->fds + nfds, fds_t,
263
+ nfds_t *sizeof(vmsg->fds[0]));
264
+ nfds += nfds_t;
265
+ g_free(fds_t);
266
+ }
267
+ if (read_bytes == VHOST_USER_HDR_SIZE || rc == 0) {
268
+ break;
269
+ }
270
+ iov.iov_base = (char *)vmsg + read_bytes;
271
+ iov.iov_len = VHOST_USER_HDR_SIZE - read_bytes;
272
+ } while (true);
273
+
274
+ vmsg->fd_num = nfds;
275
+ /* qio_channel_readv_full will make socket fds blocking, unblock them */
276
+ vmsg_unblock_fds(vmsg);
277
+ if (vmsg->size > sizeof(vmsg->payload)) {
278
+ error_report("Error: too big message request: %d, "
279
+ "size: vmsg->size: %u, "
280
+ "while sizeof(vmsg->payload) = %zu",
281
+ vmsg->request, vmsg->size, sizeof(vmsg->payload));
282
+ goto fail;
283
+ }
284
+
285
+ struct iovec iov_payload = {
286
+ .iov_base = (char *)&vmsg->payload,
287
+ .iov_len = vmsg->size,
288
+ };
289
+ if (vmsg->size) {
290
+ rc = qio_channel_readv_all_eof(ioc, &iov_payload, 1, &local_err);
291
+ if (rc == -1) {
292
+ error_report_err(local_err);
293
+ goto fail;
294
+ }
295
+ }
296
+
297
+ return true;
298
+
299
+fail:
300
+ vmsg_close_fds(vmsg);
301
+
302
+ return false;
303
+}
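
The header loop in vu_message_read() is an instance of the usual "retry on short read" pattern. A minimal sketch of that pattern with plain POSIX read() follows. It is an assumption-laden simplification: `read_header` is a hypothetical name, the fd is blocking, and the coroutine yield on QIO_CHANNEL_ERR_BLOCK and the fd passing done by the patch are left out.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read exactly len bytes from fd unless EOF arrives first.
 * Returns the number of bytes read (short only on EOF), or -1 on error.
 */
static ssize_t read_header(int fd, void *buf, size_t len)
{
    size_t done = 0;

    while (done < len) {
        /* Resume where the previous (possibly short) read stopped. */
        ssize_t rc = read(fd, (char *)buf + done, len - done);
        if (rc < 0) {
            if (errno == EINTR) {
                continue; /* interrupted by a signal, retry */
            }
            return -1;
        }
        if (rc == 0) {
            break; /* peer closed the connection */
        }
        done += (size_t)rc;
    }
    return (ssize_t)done;
}
```

As in the patch, the caller has to compare the returned count against the expected header size, because an early EOF is reported as a short count rather than as an error.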
304
+
305
+
306
+static void vu_client_start(VuServer *server);
307
+static coroutine_fn void vu_client_trip(void *opaque)
308
+{
309
+ VuServer *server = opaque;
310
+
311
+ while (!server->aio_context_changed && server->sioc) {
312
+ server->processing_msg = true;
313
+ vu_dispatch(&server->vu_dev);
314
+ server->processing_msg = false;
315
+ }
316
+
317
+ if (server->aio_context_changed && server->sioc) {
318
+ server->aio_context_changed = false;
319
+ vu_client_start(server);
320
+ }
321
+}
322
+
323
+static void vu_client_start(VuServer *server)
324
+{
325
+ server->co_trip = qemu_coroutine_create(vu_client_trip, server);
326
+ aio_co_enter(server->ctx, server->co_trip);
327
+}
328
+
329
+/*
330
+ * a wrapper for vu_kick_cb
331
+ *
332
+ * since aio_dispatch can only pass one user data pointer to the
333
+ * callback function, pack VuDev and pvt into a struct. Then unpack it
334
+ * and pass them to vu_kick_cb
335
+ */
336
+static void kick_handler(void *opaque)
337
+{
338
+ VuFdWatch *vu_fd_watch = opaque;
339
+ vu_fd_watch->processing = true;
340
+ vu_fd_watch->cb(vu_fd_watch->vu_dev, 0, vu_fd_watch->pvt);
341
+ vu_fd_watch->processing = false;
342
+}
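
The wrapper described in the comment above kick_handler() is a plain trampoline: the event loop hands back one opaque pointer, so the device pointer, the private data and the real callback are packed into one struct. A stand-alone sketch, with hypothetical names (`Watch`, `watch_cb`, `trampoline`, `count_kick`) and a simplified callback signature rather than the real vu_watch_cb:

```c
#include <stddef.h>

/* Simplified three-argument callback, loosely modelled on vu_watch_cb. */
typedef void (*watch_cb)(void *dev, int events, void *pvt);

/* Everything the real callback needs, reachable through one pointer. */
typedef struct {
    void *dev;
    void *pvt;
    watch_cb cb;
    int processing; /* non-zero while the callback is running */
} Watch;

/* Single-argument handler an event loop could call with its opaque pointer. */
static void trampoline(void *opaque)
{
    Watch *w = opaque;

    w->processing = 1;
    w->cb(w->dev, 0, w->pvt);
    w->processing = 0;
}

/* Example callback: counts how many times it was kicked. */
static void count_kick(void *dev, int events, void *pvt)
{
    (void)dev;
    (void)events;
    ++*(int *)pvt;
}
```

The `processing` flag mirrors the patch's VuFdWatch field, which close_client() checks before freeing a watch.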
343
+
344
+
345
+static VuFdWatch *find_vu_fd_watch(VuServer *server, int fd)
346
+{
347
+
348
+ VuFdWatch *vu_fd_watch, *next;
349
+ QTAILQ_FOREACH_SAFE(vu_fd_watch, &server->vu_fd_watches, next, next) {
350
+ if (vu_fd_watch->fd == fd) {
351
+ return vu_fd_watch;
352
+ }
353
+ }
354
+ return NULL;
355
+}
356
+
357
+static void
358
+set_watch(VuDev *vu_dev, int fd, int vu_evt,
359
+ vu_watch_cb cb, void *pvt)
360
+{
361
+
362
+ VuServer *server = container_of(vu_dev, VuServer, vu_dev);
363
+ g_assert(vu_dev);
364
+ g_assert(fd >= 0);
365
+ g_assert(cb);
366
+
367
+ VuFdWatch *vu_fd_watch = find_vu_fd_watch(server, fd);
368
+
369
+ if (!vu_fd_watch) {
370
+ VuFdWatch *vu_fd_watch = g_new0(VuFdWatch, 1);
371
+
372
+ QTAILQ_INSERT_TAIL(&server->vu_fd_watches, vu_fd_watch, next);
373
+
374
+ vu_fd_watch->fd = fd;
375
+ vu_fd_watch->cb = cb;
376
+ qemu_set_nonblock(fd);
377
+ aio_set_fd_handler(server->ioc->ctx, fd, true, kick_handler,
378
+ NULL, NULL, vu_fd_watch);
379
+ vu_fd_watch->vu_dev = vu_dev;
380
+ vu_fd_watch->pvt = pvt;
381
+ }
382
+}
383
+
384
+
385
+static void remove_watch(VuDev *vu_dev, int fd)
386
+{
387
+ VuServer *server;
388
+ g_assert(vu_dev);
389
+ g_assert(fd >= 0);
390
+
391
+ server = container_of(vu_dev, VuServer, vu_dev);
392
+
393
+ VuFdWatch *vu_fd_watch = find_vu_fd_watch(server, fd);
394
+
395
+ if (!vu_fd_watch) {
396
+ return;
397
+ }
398
+ aio_set_fd_handler(server->ioc->ctx, fd, true, NULL, NULL, NULL, NULL);
399
+
400
+ QTAILQ_REMOVE(&server->vu_fd_watches, vu_fd_watch, next);
401
+ g_free(vu_fd_watch);
402
+}
403
+
404
+
405
+static void vu_accept(QIONetListener *listener, QIOChannelSocket *sioc,
406
+ gpointer opaque)
407
+{
408
+ VuServer *server = opaque;
409
+
410
+ if (server->sioc) {
411
+ warn_report("Only one vhost-user client is allowed to "
412
+ "connect the server one time");
413
+ return;
414
+ }
415
+
416
+ if (!vu_init(&server->vu_dev, server->max_queues, sioc->fd, panic_cb,
417
+ vu_message_read, set_watch, remove_watch, server->vu_iface)) {
418
+ error_report("Failed to initialize libvhost-user");
419
+ return;
420
+ }
421
+
422
+ /*
423
+ * Unset the callback function for the network listener so that any other
+ * vhost-user client keeps waiting until this client disconnects
425
+ */
426
+ qio_net_listener_set_client_func(server->listener,
427
+ NULL,
428
+ NULL,
429
+ NULL);
430
+ server->sioc = sioc;
431
+ /*
432
+ * Increase the object reference, so sioc will not be freed by
433
+ * qio_net_listener_channel_func which will call object_unref(OBJECT(sioc))
434
+ */
435
+ object_ref(OBJECT(server->sioc));
436
+ qio_channel_set_name(QIO_CHANNEL(sioc), "vhost-user client");
437
+ server->ioc = QIO_CHANNEL(sioc);
438
+ object_ref(OBJECT(server->ioc));
439
+ qio_channel_attach_aio_context(server->ioc, server->ctx);
440
+ qio_channel_set_blocking(QIO_CHANNEL(server->sioc), false, NULL);
441
+ vu_client_start(server);
442
+}
443
+
444
+
445
+void vhost_user_server_stop(VuServer *server)
446
+{
447
+ if (server->sioc) {
448
+ close_client(server);
449
+ }
450
+
451
+ if (server->listener) {
452
+ qio_net_listener_disconnect(server->listener);
453
+ object_unref(OBJECT(server->listener));
454
+ }
455
+
456
+}
457
+
458
+void vhost_user_server_set_aio_context(VuServer *server, AioContext *ctx)
459
+{
460
+ VuFdWatch *vu_fd_watch, *next;
461
+ void *opaque = NULL;
462
+ IOHandler *io_read = NULL;
463
+ bool attach;
464
+
465
+ server->ctx = ctx ? ctx : qemu_get_aio_context();
466
+
467
+ if (!server->sioc) {
468
+ /* not yet serving any client*/
469
+ return;
470
+ }
471
+
472
+ if (ctx) {
473
+ qio_channel_attach_aio_context(server->ioc, ctx);
474
+ server->aio_context_changed = true;
475
+ io_read = kick_handler;
476
+ attach = true;
477
+ } else {
478
+ qio_channel_detach_aio_context(server->ioc);
479
+ /* server->ioc->ctx keeps the old AioContext */
480
+ ctx = server->ioc->ctx;
481
+ attach = false;
482
+ }
483
+
484
+ QTAILQ_FOREACH_SAFE(vu_fd_watch, &server->vu_fd_watches, next, next) {
485
+ if (vu_fd_watch->cb) {
486
+ opaque = attach ? vu_fd_watch : NULL;
487
+ aio_set_fd_handler(ctx, vu_fd_watch->fd, true,
488
+ io_read, NULL, NULL,
489
+ opaque);
490
+ }
491
+ }
492
+}
493
+
494
+
495
+bool vhost_user_server_start(VuServer *server,
496
+ SocketAddress *socket_addr,
497
+ AioContext *ctx,
498
+ uint16_t max_queues,
499
+ DevicePanicNotifierFn *device_panic_notifier,
500
+ const VuDevIface *vu_iface,
501
+ Error **errp)
502
+{
503
+ QIONetListener *listener = qio_net_listener_new();
504
+ if (qio_net_listener_open_sync(listener, socket_addr, 1,
505
+ errp) < 0) {
506
+ object_unref(OBJECT(listener));
507
+ return false;
508
+ }
509
+
510
+ /* zero out unspecified fields */
511
+ *server = (VuServer) {
512
+ .listener = listener,
513
+ .vu_iface = vu_iface,
514
+ .max_queues = max_queues,
515
+ .ctx = ctx,
516
+ .device_panic_notifier = device_panic_notifier,
517
+ };
518
+
519
+ qio_net_listener_set_name(server->listener, "vhost-user-backend-listener");
520
+
521
+ qio_net_listener_set_client_func(server->listener,
522
+ vu_accept,
523
+ server,
524
+ NULL);
525
+
526
+ QTAILQ_INIT(&server->vu_fd_watches);
527
+ return true;
528
+}
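
vhost_user_server_start() relies on a C99 guarantee: assigning a compound literal with designated initializers zero-initializes every member that is not named. A small sketch of that idiom, where `ServerSketch` and `server_reset` are hypothetical and only loosely shaped like VuServer:

```c
#include <stddef.h>

/* Hypothetical, trimmed-down stand-in for a server state struct. */
typedef struct {
    void *listener;
    void *ctx;
    int max_queues;
    void *ioc;
    int processing_msg;
} ServerSketch;

/* Reset *s: the named members get values, all other members become 0/NULL. */
static void server_reset(ServerSketch *s, void *listener, int max_queues)
{
    *s = (ServerSketch) {
        .listener = listener,
        .max_queues = max_queues,
    };
}
```

This is why the patch does not need a separate memset(): stale values in fields such as `ioc` or `processing_msg` from a previous use of the struct are cleared by the assignment itself.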
529
diff --git a/util/meson.build b/util/meson.build
530
index XXXXXXX..XXXXXXX 100644
531
--- a/util/meson.build
532
+++ b/util/meson.build
533
@@ -XXX,XX +XXX,XX @@ if have_block
534
util_ss.add(files('main-loop.c'))
535
util_ss.add(files('nvdimm-utils.c'))
536
util_ss.add(files('qemu-coroutine.c', 'qemu-coroutine-lock.c', 'qemu-coroutine-io.c'))
537
+ util_ss.add(when: 'CONFIG_LINUX', if_true: files('vhost-user-server.c'))
538
util_ss.add(files('qemu-coroutine-sleep.c'))
539
util_ss.add(files('qemu-co-shared-resource.c'))
540
util_ss.add(files('thread-pool.c', 'qemu-timer.c'))
541
--
542
2.26.2
543
1
From: Jagannathan Raman <jag.raman@oracle.com>
1
From: Coiby Xu <coiby.xu@gmail.com>
2
2
3
Define vfio-user object which is remote process server for QEMU. Setup
3
Move the constants from hw/core/qdev-properties.c to
4
object initialization functions and properties necessary to instantiate
4
util/block-helpers.h so that knowledge of the min/max values is
5
the object
6
5
7
Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
6
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
8
Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
7
Signed-off-by: Coiby Xu <coiby.xu@gmail.com>
9
Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
10
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
8
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
11
Message-id: e45a17001e9b38f451543a664ababdf860e5f2f2.1655151679.git.jag.raman@oracle.com
9
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
10
Acked-by: Eduardo Habkost <ehabkost@redhat.com>
11
Message-id: 20200918080912.321299-5-coiby.xu@gmail.com
12
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
12
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
13
---
13
---
14
MAINTAINERS | 1 +
14
util/block-helpers.h | 19 +++++++++++++
15
qapi/qom.json | 20 +++-
15
hw/core/qdev-properties-system.c | 31 ++++-----------------
16
include/hw/remote/machine.h | 2 +
16
util/block-helpers.c | 46 ++++++++++++++++++++++++++++++++
17
hw/remote/machine.c | 27 +++++
17
util/meson.build | 1 +
18
hw/remote/vfio-user-obj.c | 210 ++++++++++++++++++++++++++++++++++++
18
4 files changed, 71 insertions(+), 26 deletions(-)
19
hw/remote/meson.build | 1 +
19
create mode 100644 util/block-helpers.h
20
hw/remote/trace-events | 3 +
20
create mode 100644 util/block-helpers.c
21
7 files changed, 262 insertions(+), 2 deletions(-)
22
create mode 100644 hw/remote/vfio-user-obj.c
23
21
24
diff --git a/MAINTAINERS b/MAINTAINERS
22
diff --git a/util/block-helpers.h b/util/block-helpers.h
25
index XXXXXXX..XXXXXXX 100644
26
--- a/MAINTAINERS
27
+++ b/MAINTAINERS
28
@@ -XXX,XX +XXX,XX @@ F: include/hw/remote/proxy-memory-listener.h
29
F: hw/remote/iohub.c
30
F: include/hw/remote/iohub.h
31
F: subprojects/libvfio-user
32
+F: hw/remote/vfio-user-obj.c
33
34
EBPF:
35
M: Jason Wang <jasowang@redhat.com>
36
diff --git a/qapi/qom.json b/qapi/qom.json
37
index XXXXXXX..XXXXXXX 100644
38
--- a/qapi/qom.json
39
+++ b/qapi/qom.json
40
@@ -XXX,XX +XXX,XX @@
41
{ 'struct': 'RemoteObjectProperties',
42
'data': { 'fd': 'str', 'devid': 'str' } }
43
44
+##
45
+# @VfioUserServerProperties:
46
+#
47
+# Properties for x-vfio-user-server objects.
48
+#
49
+# @socket: socket to be used by the libvfio-user library
50
+#
51
+# @device: the ID of the device to be emulated at the server
52
+#
53
+# Since: 7.1
54
+##
55
+{ 'struct': 'VfioUserServerProperties',
56
+ 'data': { 'socket': 'SocketAddress', 'device': 'str' } }
57
+
58
##
59
# @RngProperties:
60
#
61
@@ -XXX,XX +XXX,XX @@
62
'tls-creds-psk',
63
'tls-creds-x509',
64
'tls-cipher-suites',
65
- { 'name': 'x-remote-object', 'features': [ 'unstable' ] }
66
+ { 'name': 'x-remote-object', 'features': [ 'unstable' ] },
67
+ { 'name': 'x-vfio-user-server', 'features': [ 'unstable' ] }
68
] }
69
70
##
71
@@ -XXX,XX +XXX,XX @@
72
'tls-creds-psk': 'TlsCredsPskProperties',
73
'tls-creds-x509': 'TlsCredsX509Properties',
74
'tls-cipher-suites': 'TlsCredsProperties',
75
- 'x-remote-object': 'RemoteObjectProperties'
76
+ 'x-remote-object': 'RemoteObjectProperties',
77
+ 'x-vfio-user-server': 'VfioUserServerProperties'
78
} }
79
80
##
81
diff --git a/include/hw/remote/machine.h b/include/hw/remote/machine.h
82
index XXXXXXX..XXXXXXX 100644
83
--- a/include/hw/remote/machine.h
84
+++ b/include/hw/remote/machine.h
85
@@ -XXX,XX +XXX,XX @@ struct RemoteMachineState {
86
RemoteIOHubState iohub;
87
88
bool vfio_user;
89
+
90
+ bool auto_shutdown;
91
};
92
93
/* Used to pass to co-routine device and ioc. */
94
diff --git a/hw/remote/machine.c b/hw/remote/machine.c
95
index XXXXXXX..XXXXXXX 100644
96
--- a/hw/remote/machine.c
97
+++ b/hw/remote/machine.c
98
@@ -XXX,XX +XXX,XX @@ static void remote_machine_set_vfio_user(Object *obj, bool value, Error **errp)
99
s->vfio_user = value;
100
}
101
102
+static bool remote_machine_get_auto_shutdown(Object *obj, Error **errp)
103
+{
104
+ RemoteMachineState *s = REMOTE_MACHINE(obj);
105
+
106
+ return s->auto_shutdown;
107
+}
108
+
109
+static void remote_machine_set_auto_shutdown(Object *obj, bool value,
110
+ Error **errp)
111
+{
112
+ RemoteMachineState *s = REMOTE_MACHINE(obj);
113
+
114
+ s->auto_shutdown = value;
115
+}
116
+
117
+static void remote_machine_instance_init(Object *obj)
118
+{
119
+ RemoteMachineState *s = REMOTE_MACHINE(obj);
120
+
121
+ s->auto_shutdown = true;
122
+}
123
+
124
static void remote_machine_class_init(ObjectClass *oc, void *data)
125
{
126
MachineClass *mc = MACHINE_CLASS(oc);
127
@@ -XXX,XX +XXX,XX @@ static void remote_machine_class_init(ObjectClass *oc, void *data)
128
object_class_property_add_bool(oc, "vfio-user",
129
remote_machine_get_vfio_user,
130
remote_machine_set_vfio_user);
131
+
132
+ object_class_property_add_bool(oc, "auto-shutdown",
133
+ remote_machine_get_auto_shutdown,
134
+ remote_machine_set_auto_shutdown);
135
}
136
137
static const TypeInfo remote_machine = {
138
.name = TYPE_REMOTE_MACHINE,
139
.parent = TYPE_MACHINE,
140
.instance_size = sizeof(RemoteMachineState),
141
+ .instance_init = remote_machine_instance_init,
142
.class_init = remote_machine_class_init,
143
.interfaces = (InterfaceInfo[]) {
144
{ TYPE_HOTPLUG_HANDLER },
145
diff --git a/hw/remote/vfio-user-obj.c b/hw/remote/vfio-user-obj.c
146
new file mode 100644
23
new file mode 100644
147
index XXXXXXX..XXXXXXX
24
index XXXXXXX..XXXXXXX
148
--- /dev/null
25
--- /dev/null
149
+++ b/hw/remote/vfio-user-obj.c
26
+++ b/util/block-helpers.h
150
@@ -XXX,XX +XXX,XX @@
27
@@ -XXX,XX +XXX,XX @@
151
+/**
28
+#ifndef BLOCK_HELPERS_H
152
+ * QEMU vfio-user-server server object
29
+#define BLOCK_HELPERS_H
30
+
31
+#include "qemu/units.h"
32
+
33
+/* lower limit is sector size */
34
+#define MIN_BLOCK_SIZE INT64_C(512)
35
+#define MIN_BLOCK_SIZE_STR "512 B"
36
+/*
37
+ * upper limit is arbitrary, 2 MiB looks sufficient for all sensible uses, and
38
+ * matches qcow2 cluster size limit
39
+ */
40
+#define MAX_BLOCK_SIZE (2 * MiB)
41
+#define MAX_BLOCK_SIZE_STR "2 MiB"
42
+
43
+void check_block_size(const char *id, const char *name, int64_t value,
44
+ Error **errp);
45
+
46
+#endif /* BLOCK_HELPERS_H */
47
diff --git a/hw/core/qdev-properties-system.c b/hw/core/qdev-properties-system.c
48
index XXXXXXX..XXXXXXX 100644
49
--- a/hw/core/qdev-properties-system.c
50
+++ b/hw/core/qdev-properties-system.c
51
@@ -XXX,XX +XXX,XX @@
52
#include "sysemu/blockdev.h"
53
#include "net/net.h"
54
#include "hw/pci/pci.h"
55
+#include "util/block-helpers.h"
56
57
static bool check_prop_still_unset(DeviceState *dev, const char *name,
58
const void *old_val, const char *new_val,
59
@@ -XXX,XX +XXX,XX @@ const PropertyInfo qdev_prop_losttickpolicy = {
60
61
/* --- blocksize --- */
62
63
-/* lower limit is sector size */
64
-#define MIN_BLOCK_SIZE 512
65
-#define MIN_BLOCK_SIZE_STR "512 B"
66
-/*
67
- * upper limit is arbitrary, 2 MiB looks sufficient for all sensible uses, and
68
- * matches qcow2 cluster size limit
69
- */
70
-#define MAX_BLOCK_SIZE (2 * MiB)
71
-#define MAX_BLOCK_SIZE_STR "2 MiB"
72
-
73
static void set_blocksize(Object *obj, Visitor *v, const char *name,
74
void *opaque, Error **errp)
75
{
76
@@ -XXX,XX +XXX,XX @@ static void set_blocksize(Object *obj, Visitor *v, const char *name,
77
Property *prop = opaque;
78
uint32_t *ptr = qdev_get_prop_ptr(dev, prop);
79
uint64_t value;
80
+ Error *local_err = NULL;
81
82
if (dev->realized) {
83
qdev_prop_set_after_realize(dev, name, errp);
84
@@ -XXX,XX +XXX,XX @@ static void set_blocksize(Object *obj, Visitor *v, const char *name,
85
if (!visit_type_size(v, name, &value, errp)) {
86
return;
87
}
88
- /* value of 0 means "unset" */
89
- if (value && (value < MIN_BLOCK_SIZE || value > MAX_BLOCK_SIZE)) {
90
- error_setg(errp,
91
- "Property %s.%s doesn't take value %" PRIu64
92
- " (minimum: " MIN_BLOCK_SIZE_STR
93
- ", maximum: " MAX_BLOCK_SIZE_STR ")",
94
- dev->id ? : "", name, value);
95
+ check_block_size(dev->id ? : "", name, value, &local_err);
96
+ if (local_err) {
97
+ error_propagate(errp, local_err);
98
return;
99
}
100
-
101
- /* We rely on power-of-2 blocksizes for bitmasks */
102
- if ((value & (value - 1)) != 0) {
103
- error_setg(errp,
104
- "Property %s.%s doesn't take value '%" PRId64 "', "
105
- "it's not a power of 2", dev->id ?: "", name, (int64_t)value);
106
- return;
107
- }
108
-
109
*ptr = value;
110
}
111
112
diff --git a/util/block-helpers.c b/util/block-helpers.c
113
new file mode 100644
114
index XXXXXXX..XXXXXXX
115
--- /dev/null
116
+++ b/util/block-helpers.c
117
@@ -XXX,XX +XXX,XX @@
118
+/*
119
+ * Block utility functions
153
+ *
120
+ *
154
+ * Copyright © 2022 Oracle and/or its affiliates.
121
+ * Copyright IBM, Corp. 2011
122
+ * Copyright (c) 2020 Coiby Xu <coiby.xu@gmail.com>
155
+ *
123
+ *
156
+ * This work is licensed under the terms of the GNU GPL-v2, version 2 or later.
124
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
157
+ *
158
+ * See the COPYING file in the top-level directory.
125
+ * See the COPYING file in the top-level directory.
159
+ *
160
+ */
161
+
162
+/**
163
+ * Usage: add options:
164
+ * -machine x-remote,vfio-user=on,auto-shutdown=on
165
+ * -device <PCI-device>,id=<pci-dev-id>
166
+ * -object x-vfio-user-server,id=<id>,type=unix,path=<socket-path>,
167
+ * device=<pci-dev-id>
168
+ *
169
+ * Note that x-vfio-user-server object must be used with x-remote machine only.
170
+ * This server could only support PCI devices for now.
171
+ *
172
+ * type - SocketAddress type - presently "unix" alone is supported. Required
173
+ * option
174
+ *
175
+ * path - named unix socket, it will be created by the server. It is
176
+ * a required option
177
+ *
178
+ * device - id of a device on the server, a required option. PCI devices
179
+ * alone are supported presently.
180
+ */
126
+ */
181
+
127
+
182
+#include "qemu/osdep.h"
128
+#include "qemu/osdep.h"
183
+
184
+#include "qom/object.h"
185
+#include "qom/object_interfaces.h"
186
+#include "qemu/error-report.h"
187
+#include "trace.h"
188
+#include "sysemu/runstate.h"
189
+#include "hw/boards.h"
190
+#include "hw/remote/machine.h"
191
+#include "qapi/error.h"
129
+#include "qapi/error.h"
192
+#include "qapi/qapi-visit-sockets.h"
130
+#include "qapi/qmp/qerror.h"
193
+
131
+#include "block-helpers.h"
194
+#define TYPE_VFU_OBJECT "x-vfio-user-server"
195
+OBJECT_DECLARE_TYPE(VfuObject, VfuObjectClass, VFU_OBJECT)
196
+
132
+
197
+/**
133
+/**
198
+ * VFU_OBJECT_ERROR - reports an error message. If auto_shutdown
134
+ * check_block_size:
199
+ * is set, it aborts the machine on error. Otherwise, it logs an
135
+ * @id: The unique ID of the object
200
+ * error message without aborting.
136
+ * @name: The name of the property being validated
137
+ * @value: The block size in bytes
138
+ * @errp: A pointer to an area to store an error
139
+ *
140
+ * This function checks that the block size meets the following conditions:
141
+ * 1. At least MIN_BLOCK_SIZE
142
+ * 2. No larger than MAX_BLOCK_SIZE
143
+ * 3. A power of 2
201
+ */
144
+ */
202
+#define VFU_OBJECT_ERROR(o, fmt, ...) \
145
+void check_block_size(const char *id, const char *name, int64_t value,
203
+ { \
146
+ Error **errp)
204
+ if (vfu_object_auto_shutdown()) { \
205
+ error_setg(&error_abort, (fmt), ## __VA_ARGS__); \
206
+ } else { \
207
+ error_report((fmt), ## __VA_ARGS__); \
208
+ } \
209
+ } \
210
+
211
+struct VfuObjectClass {
212
+ ObjectClass parent_class;
213
+
214
+ unsigned int nr_devs;
215
+};
216
+
217
+struct VfuObject {
218
+ /* private */
219
+ Object parent;
220
+
221
+ SocketAddress *socket;
222
+
223
+ char *device;
224
+
225
+ Error *err;
226
+};
227
+
228
+static bool vfu_object_auto_shutdown(void)
229
+{
147
+{
230
+ bool auto_shutdown = true;
148
+ /* value of 0 means "unset" */
231
+ Error *local_err = NULL;
149
+ if (value && (value < MIN_BLOCK_SIZE || value > MAX_BLOCK_SIZE)) {
232
+
150
+ error_setg(errp, QERR_PROPERTY_VALUE_OUT_OF_RANGE,
233
+ if (!current_machine) {
151
+ id, name, value, MIN_BLOCK_SIZE, MAX_BLOCK_SIZE);
234
+ return auto_shutdown;
235
+ }
236
+
237
+ auto_shutdown = object_property_get_bool(OBJECT(current_machine),
238
+ "auto-shutdown",
239
+ &local_err);
240
+
241
+ /*
242
+ * local_err would be set if no such property exists - safe to ignore.
243
+ * Unlikely scenario as auto-shutdown is always defined for
244
+ * TYPE_REMOTE_MACHINE, and TYPE_VFU_OBJECT only works with
245
+ * TYPE_REMOTE_MACHINE
246
+ */
247
+ if (local_err) {
248
+ auto_shutdown = true;
249
+ error_free(local_err);
250
+ }
251
+
252
+ return auto_shutdown;
253
+}
254
+
255
+static void vfu_object_set_socket(Object *obj, Visitor *v, const char *name,
256
+ void *opaque, Error **errp)
257
+{
258
+ VfuObject *o = VFU_OBJECT(obj);
259
+
260
+ qapi_free_SocketAddress(o->socket);
261
+
262
+ o->socket = NULL;
263
+
264
+ visit_type_SocketAddress(v, name, &o->socket, errp);
265
+
266
+ if (o->socket->type != SOCKET_ADDRESS_TYPE_UNIX) {
267
+ error_setg(errp, "vfu: Unsupported socket type - %s",
268
+ SocketAddressType_str(o->socket->type));
269
+ qapi_free_SocketAddress(o->socket);
270
+ o->socket = NULL;
271
+ return;
152
+ return;
272
+ }
153
+ }
273
+
154
+
274
+ trace_vfu_prop("socket", o->socket->u.q_unix.path);
155
+ /* We rely on power-of-2 blocksizes for bitmasks */
275
+}
156
+ if ((value & (value - 1)) != 0) {
276
+
157
+ error_setg(errp,
277
+static void vfu_object_set_device(Object *obj, const char *str, Error **errp)
158
+ "Property %s.%s doesn't take value '%" PRId64
278
+{
159
+ "', it's not a power of 2",
279
+ VfuObject *o = VFU_OBJECT(obj);
160
+ id, name, value);
280
+
281
+ g_free(o->device);
282
+
283
+ o->device = g_strdup(str);
284
+
285
+ trace_vfu_prop("device", str);
286
+}
287
+
288
+static void vfu_object_init(Object *obj)
289
+{
290
+ VfuObjectClass *k = VFU_OBJECT_GET_CLASS(obj);
291
+ VfuObject *o = VFU_OBJECT(obj);
292
+
293
+ k->nr_devs++;
294
+
295
+ if (!object_dynamic_cast(OBJECT(current_machine), TYPE_REMOTE_MACHINE)) {
296
+ error_setg(&o->err, "vfu: %s only compatible with %s machine",
297
+ TYPE_VFU_OBJECT, TYPE_REMOTE_MACHINE);
298
+ return;
161
+ return;
299
+ }
162
+ }
300
+}
163
+}
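
The three conditions checked by check_block_size() can be condensed into a pure predicate, which makes the treatment of 0 ("unset") and the power-of-2 bit trick easier to see. This is a sketch only: `block_size_ok` and the `SKETCH_*` constants are hypothetical names, and error reporting through `Error **errp` is omitted.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same limits as in the patch: 512 B and 2 MiB. */
#define SKETCH_MIN_BLOCK_SIZE INT64_C(512)
#define SKETCH_MAX_BLOCK_SIZE (INT64_C(2) * 1024 * 1024)

/* Returns true if value is 0 ("unset") or a power of 2 within the limits. */
static bool block_size_ok(int64_t value)
{
    if (value == 0) {
        return true; /* 0 means the property was left unset */
    }
    if (value < SKETCH_MIN_BLOCK_SIZE || value > SKETCH_MAX_BLOCK_SIZE) {
        return false;
    }
    /* A power of 2 has exactly one bit set, so value & (value - 1) is 0. */
    return (value & (value - 1)) == 0;
}
```

For example, 4096 passes, while 1536 is within range but fails the power-of-2 test because it has two bits set.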
301
+
164
diff --git a/util/meson.build b/util/meson.build
302
+static void vfu_object_finalize(Object *obj)
303
+{
304
+ VfuObjectClass *k = VFU_OBJECT_GET_CLASS(obj);
305
+ VfuObject *o = VFU_OBJECT(obj);
306
+
307
+ k->nr_devs--;
308
+
309
+ qapi_free_SocketAddress(o->socket);
310
+
311
+ o->socket = NULL;
312
+
313
+ g_free(o->device);
314
+
315
+ o->device = NULL;
316
+
317
+ if (!k->nr_devs && vfu_object_auto_shutdown()) {
318
+ qemu_system_shutdown_request(SHUTDOWN_CAUSE_GUEST_SHUTDOWN);
319
+ }
320
+}
321
+
322
+static void vfu_object_class_init(ObjectClass *klass, void *data)
323
+{
324
+ VfuObjectClass *k = VFU_OBJECT_CLASS(klass);
325
+
326
+ k->nr_devs = 0;
327
+
328
+ object_class_property_add(klass, "socket", "SocketAddress", NULL,
329
+ vfu_object_set_socket, NULL, NULL);
330
+ object_class_property_set_description(klass, "socket",
331
+ "SocketAddress "
332
+ "(ex: type=unix,path=/tmp/sock). "
333
+ "Only UNIX is presently supported");
334
+ object_class_property_add_str(klass, "device", NULL,
335
+ vfu_object_set_device);
336
+ object_class_property_set_description(klass, "device",
337
+ "device ID - only PCI devices "
338
+ "are presently supported");
339
+}
340
+
341
+static const TypeInfo vfu_object_info = {
342
+ .name = TYPE_VFU_OBJECT,
343
+ .parent = TYPE_OBJECT,
344
+ .instance_size = sizeof(VfuObject),
345
+ .instance_init = vfu_object_init,
346
+ .instance_finalize = vfu_object_finalize,
347
+ .class_size = sizeof(VfuObjectClass),
348
+ .class_init = vfu_object_class_init,
349
+ .interfaces = (InterfaceInfo[]) {
350
+ { TYPE_USER_CREATABLE },
351
+ { }
352
+ }
353
+};
354
+
355
+static void vfu_register_types(void)
356
+{
357
+ type_register_static(&vfu_object_info);
358
+}
359
+
360
+type_init(vfu_register_types);
361
diff --git a/hw/remote/meson.build b/hw/remote/meson.build
362
index XXXXXXX..XXXXXXX 100644
165
index XXXXXXX..XXXXXXX 100644
363
--- a/hw/remote/meson.build
166
--- a/util/meson.build
364
+++ b/hw/remote/meson.build
167
+++ b/util/meson.build
365
@@ -XXX,XX +XXX,XX @@ remote_ss.add(when: 'CONFIG_MULTIPROCESS', if_true: files('message.c'))
168
@@ -XXX,XX +XXX,XX @@ if have_block
366
remote_ss.add(when: 'CONFIG_MULTIPROCESS', if_true: files('remote-obj.c'))
169
util_ss.add(files('nvdimm-utils.c'))
367
remote_ss.add(when: 'CONFIG_MULTIPROCESS', if_true: files('proxy.c'))
170
util_ss.add(files('qemu-coroutine.c', 'qemu-coroutine-lock.c', 'qemu-coroutine-io.c'))
368
remote_ss.add(when: 'CONFIG_MULTIPROCESS', if_true: files('iohub.c'))
171
util_ss.add(when: 'CONFIG_LINUX', if_true: files('vhost-user-server.c'))
369
+remote_ss.add(when: 'CONFIG_VFIO_USER_SERVER', if_true: files('vfio-user-obj.c'))
172
+ util_ss.add(files('block-helpers.c'))
370
173
util_ss.add(files('qemu-coroutine-sleep.c'))
371
remote_ss.add(when: 'CONFIG_VFIO_USER_SERVER', if_true: libvfio_user_dep)
174
util_ss.add(files('qemu-co-shared-resource.c'))
372
175
util_ss.add(files('thread-pool.c', 'qemu-timer.c'))
373
diff --git a/hw/remote/trace-events b/hw/remote/trace-events
374
index XXXXXXX..XXXXXXX 100644
375
--- a/hw/remote/trace-events
376
+++ b/hw/remote/trace-events
377
@@ -XXX,XX +XXX,XX @@
378
379
mpqemu_send_io_error(int cmd, int size, int nfds) "send command %d size %d, %d file descriptors to remote process"
380
mpqemu_recv_io_error(int cmd, int size, int nfds) "failed to receive %d size %d, %d file descriptors to remote process"
381
+
382
+# vfio-user-obj.c
383
+vfu_prop(const char *prop, const char *val) "vfu: setting %s as %s"
384
--
176
--
385
2.36.1
177
2.26.2
386
178
387
1
From: Jagannathan Raman <jag.raman@oracle.com>
1
From: Coiby Xu <coiby.xu@gmail.com>
2
2
3
Assign separate address space for each device in the remote processes.
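
As a rough illustration of the idea (not the code in this patch), the sketch below keeps one lazily created address-space object per PCI devfn and drops it again on unplug. All Demo* names and the fixed-size table are hypothetical; the patch itself uses a GHashTable guarded by a QemuMutex and real AddressSpace/MemoryRegion objects.

```c
#include <assert.h>
#include <stdlib.h>

/* A PCI devfn is 8 bits wide (5-bit slot, 3-bit function). */
#define DEMO_MAX_DEVFN 256

/* Hypothetical stand-in for a per-device address space. */
typedef struct DemoAddressSpace {
    int devfn;
} DemoAddressSpace;

/* Hypothetical per-bus IOMMU state: one slot per devfn. */
typedef struct DemoIommu {
    DemoAddressSpace *elem_by_devfn[DEMO_MAX_DEVFN];
} DemoIommu;

/* Return the address space for devfn, creating it on first use. */
static DemoAddressSpace *demo_find_add_as(DemoIommu *iommu, int devfn)
{
    assert(devfn >= 0 && devfn < DEMO_MAX_DEVFN);

    if (!iommu->elem_by_devfn[devfn]) {
        DemoAddressSpace *as = calloc(1, sizeof(*as));
        if (!as) {
            abort();
        }
        as->devfn = devfn;
        iommu->elem_by_devfn[devfn] = as;
    }
    return iommu->elem_by_devfn[devfn];
}

/* Drop the address space of an unplugged device, if any. */
static void demo_unplug(DemoIommu *iommu, int devfn)
{
    assert(devfn >= 0 && devfn < DEMO_MAX_DEVFN);

    free(iommu->elem_by_devfn[devfn]);
    iommu->elem_by_devfn[devfn] = NULL;
}
```

In this sketch, repeated lookups for the same devfn return the same object, while two different devfns never share one. That is the isolation property the commit message describes.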
3
By making use of libvhost-user, a block drive can be shared with
4
the connected vhost-user client. Only one client can connect to the
5
server at a time.
4
6
5
Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
7
Since vhost-user-server needs a block drive to be created first, delay
6
Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
8
the creation of this object.
7
Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
9
10
Suggested-by: Kevin Wolf <kwolf@redhat.com>
11
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
12
Signed-off-by: Coiby Xu <coiby.xu@gmail.com>
8
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
13
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
9
Message-id: afe0b0a97582cdad42b5b25636a29c523265a10a.1655151679.git.jag.raman@oracle.com
14
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
15
Message-id: 20200918080912.321299-6-coiby.xu@gmail.com
16
[Shorten "vhost_user_blk_server" string to "vhost_user_blk" to avoid the
17
following compiler warning:
18
../block/export/vhost-user-blk-server.c:178:50: error: ‘%s’ directive output truncated writing 21 bytes into a region of size 20 [-Werror=format-truncation=]
19
and fix "Invalid size %ld ..." ssize_t format string arguments for
20
32-bit hosts.
21
--Stefan]
10
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
22
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
11
---
23
---
12
MAINTAINERS | 2 +
24
block/export/vhost-user-blk-server.h | 36 ++
13
include/hw/remote/iommu.h | 40 ++++++++++++
25
block/export/vhost-user-blk-server.c | 661 +++++++++++++++++++++++++++
14
hw/remote/iommu.c | 131 ++++++++++++++++++++++++++++++++++++++
26
softmmu/vl.c | 4 +
15
hw/remote/machine.c | 13 +++-
27
block/meson.build | 1 +
16
hw/remote/meson.build | 1 +
28
4 files changed, 702 insertions(+)
17
5 files changed, 186 insertions(+), 1 deletion(-)
29
create mode 100644 block/export/vhost-user-blk-server.h
18
create mode 100644 include/hw/remote/iommu.h
30
create mode 100644 block/export/vhost-user-blk-server.c
19
create mode 100644 hw/remote/iommu.c
20
31
21
diff --git a/MAINTAINERS b/MAINTAINERS
32
diff --git a/block/export/vhost-user-blk-server.h b/block/export/vhost-user-blk-server.h
22
index XXXXXXX..XXXXXXX 100644
23
--- a/MAINTAINERS
24
+++ b/MAINTAINERS
25
@@ -XXX,XX +XXX,XX @@ F: hw/remote/iohub.c
26
F: include/hw/remote/iohub.h
27
F: subprojects/libvfio-user
28
F: hw/remote/vfio-user-obj.c
29
+F: hw/remote/iommu.c
30
+F: include/hw/remote/iommu.h
31
32
EBPF:
33
M: Jason Wang <jasowang@redhat.com>
34
diff --git a/include/hw/remote/iommu.h b/include/hw/remote/iommu.h
35
new file mode 100644
33
new file mode 100644
36
index XXXXXXX..XXXXXXX
34
index XXXXXXX..XXXXXXX
37
--- /dev/null
35
--- /dev/null
38
+++ b/include/hw/remote/iommu.h
36
+++ b/block/export/vhost-user-blk-server.h
39
@@ -XXX,XX +XXX,XX @@
37
@@ -XXX,XX +XXX,XX @@
40
+/**
38
+/*
41
+ * Copyright © 2022 Oracle and/or its affiliates.
39
+ * Sharing QEMU block devices via vhost-user protocol
42
+ *
40
+ *
43
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
41
+ * Copyright (c) Coiby Xu <coiby.xu@gmail.com>.
44
+ * See the COPYING file in the top-level directory.
42
+ * Copyright (c) 2020 Red Hat, Inc.
45
+ *
43
+ *
44
+ * This work is licensed under the terms of the GNU GPL, version 2 or
45
+ * later. See the COPYING file in the top-level directory.
46
+ */
46
+ */
47
+
47
+
48
+#ifndef REMOTE_IOMMU_H
48
+#ifndef VHOST_USER_BLK_SERVER_H
49
+#define REMOTE_IOMMU_H
49
+#define VHOST_USER_BLK_SERVER_H
50
+
50
+#include "util/vhost-user-server.h"
51
+#include "hw/pci/pci_bus.h"
51
+
52
+#include "hw/pci/pci.h"
52
+typedef struct VuBlockDev VuBlockDev;
53
+
53
+#define TYPE_VHOST_USER_BLK_SERVER "vhost-user-blk-server"
54
+#ifndef INT2VOIDP
54
+#define VHOST_USER_BLK_SERVER(obj) \
55
+#define INT2VOIDP(i) (void *)(uintptr_t)(i)
55
+ OBJECT_CHECK(VuBlockDev, obj, TYPE_VHOST_USER_BLK_SERVER)
56
+#endif
56
+
57
+
57
+/* vhost user block device */
58
+typedef struct RemoteIommuElem {
58
+struct VuBlockDev {
59
+ MemoryRegion *mr;
59
+ Object parent_obj;
60
+
60
+ char *node_name;
61
+ AddressSpace as;
61
+ SocketAddress *addr;
62
+} RemoteIommuElem;
62
+ AioContext *ctx;
63
+
63
+ VuServer vu_server;
64
+#define TYPE_REMOTE_IOMMU "x-remote-iommu"
64
+ bool running;
65
+OBJECT_DECLARE_SIMPLE_TYPE(RemoteIommu, REMOTE_IOMMU)
65
+ uint32_t blk_size;
66
+
66
+ BlockBackend *backend;
67
+struct RemoteIommu {
67
+ QIOChannelSocket *sioc;
68
+ Object parent;
68
+ QTAILQ_ENTRY(VuBlockDev) next;
69
+
69
+ struct virtio_blk_config blkcfg;
70
+ GHashTable *elem_by_devfn;
70
+ bool writable;
71
+
72
+ QemuMutex lock;
73
+};
71
+};
74
+
72
+
75
+void remote_iommu_setup(PCIBus *pci_bus);
73
+#endif /* VHOST_USER_BLK_SERVER_H */
76
+
74
diff --git a/block/export/vhost-user-blk-server.c b/block/export/vhost-user-blk-server.c
77
+void remote_iommu_unplug_dev(PCIDevice *pci_dev);
78
+
79
+#endif
80
diff --git a/hw/remote/iommu.c b/hw/remote/iommu.c
81
new file mode 100644
75
new file mode 100644
82
index XXXXXXX..XXXXXXX
76
index XXXXXXX..XXXXXXX
83
--- /dev/null
77
--- /dev/null
84
+++ b/hw/remote/iommu.c
78
+++ b/block/export/vhost-user-blk-server.c
85
@@ -XXX,XX +XXX,XX @@
79
@@ -XXX,XX +XXX,XX @@
86
+/**
80
+/*
87
+ * IOMMU for remote device
81
+ * Sharing QEMU block devices via vhost-user protocol
88
+ *
82
+ *
89
+ * Copyright © 2022 Oracle and/or its affiliates.
83
+ * Parts of the code based on nbd/server.c.
90
+ *
84
+ *
91
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
85
+ * Copyright (c) Coiby Xu <coiby.xu@gmail.com>.
92
+ * See the COPYING file in the top-level directory.
86
+ * Copyright (c) 2020 Red Hat, Inc.
87
+ *
88
+ * This work is licensed under the terms of the GNU GPL, version 2 or
89
+ * later. See the COPYING file in the top-level directory.
90
+ */
91
+#include "qemu/osdep.h"
92
+#include "block/block.h"
93
+#include "vhost-user-blk-server.h"
94
+#include "qapi/error.h"
95
+#include "qom/object_interfaces.h"
96
+#include "sysemu/block-backend.h"
97
+#include "util/block-helpers.h"
98
+
99
+enum {
100
+ VHOST_USER_BLK_MAX_QUEUES = 1,
101
+};
102
+struct virtio_blk_inhdr {
103
+ unsigned char status;
104
+};
105
+
106
+typedef struct VuBlockReq {
107
+ VuVirtqElement *elem;
108
+ int64_t sector_num;
109
+ size_t size;
110
+ struct virtio_blk_inhdr *in;
111
+ struct virtio_blk_outhdr out;
112
+ VuServer *server;
113
+ struct VuVirtq *vq;
114
+} VuBlockReq;
115
+
116
+static void vu_block_req_complete(VuBlockReq *req)
117
+{
118
+ VuDev *vu_dev = &req->server->vu_dev;
119
+
120
+ /* IO size with 1 extra status byte */
121
+ vu_queue_push(vu_dev, req->vq, req->elem, req->size + 1);
122
+ vu_queue_notify(vu_dev, req->vq);
123
+
124
+ if (req->elem) {
125
+ free(req->elem);
126
+ }
127
+
128
+ g_free(req);
129
+}
130
+
131
+static VuBlockDev *get_vu_block_device_by_server(VuServer *server)
132
+{
133
+ return container_of(server, VuBlockDev, vu_server);
134
+}
135
+
136
+static int coroutine_fn
137
+vu_block_discard_write_zeroes(VuBlockReq *req, struct iovec *iov,
138
+ uint32_t iovcnt, uint32_t type)
139
+{
140
+ struct virtio_blk_discard_write_zeroes desc;
141
+ ssize_t size = iov_to_buf(iov, iovcnt, 0, &desc, sizeof(desc));
142
+ if (unlikely(size != sizeof(desc))) {
143
+ error_report("Invalid size %zd, expect %zu", size, sizeof(desc));
144
+ return -EINVAL;
145
+ }
146
+
147
+ VuBlockDev *vdev_blk = get_vu_block_device_by_server(req->server);
148
+ uint64_t range[2] = { le64_to_cpu(desc.sector) << 9,
149
+ le32_to_cpu(desc.num_sectors) << 9 };
150
+ if (type == VIRTIO_BLK_T_DISCARD) {
151
+ if (blk_co_pdiscard(vdev_blk->backend, range[0], range[1]) == 0) {
152
+ return 0;
153
+ }
154
+ } else if (type == VIRTIO_BLK_T_WRITE_ZEROES) {
155
+ if (blk_co_pwrite_zeroes(vdev_blk->backend,
156
+ range[0], range[1], 0) == 0) {
157
+ return 0;
158
+ }
159
+ }
160
+
161
+ return -EINVAL;
162
+}
163
+
164
+static void coroutine_fn vu_block_flush(VuBlockReq *req)
165
+{
166
+ VuBlockDev *vdev_blk = get_vu_block_device_by_server(req->server);
167
+ BlockBackend *backend = vdev_blk->backend;
168
+ blk_co_flush(backend);
169
+}
170
+
171
+struct req_data {
172
+ VuServer *server;
173
+ VuVirtq *vq;
174
+ VuVirtqElement *elem;
175
+};
176
+
177
+static void coroutine_fn vu_block_virtio_process_req(void *opaque)
178
+{
179
+ struct req_data *data = opaque;
180
+ VuServer *server = data->server;
181
+ VuVirtq *vq = data->vq;
182
+ VuVirtqElement *elem = data->elem;
183
+ uint32_t type;
184
+ VuBlockReq *req;
185
+
186
+ VuBlockDev *vdev_blk = get_vu_block_device_by_server(server);
187
+ BlockBackend *backend = vdev_blk->backend;
188
+
189
+ struct iovec *in_iov = elem->in_sg;
190
+ struct iovec *out_iov = elem->out_sg;
191
+ unsigned in_num = elem->in_num;
192
+ unsigned out_num = elem->out_num;
193
+ /* refer to hw/block/virtio-blk.c */
194
+ if (elem->out_num < 1 || elem->in_num < 1) {
195
+ error_report("virtio-blk request missing headers");
196
+ free(elem);
197
+ return;
198
+ }
199
+
200
+ req = g_new0(VuBlockReq, 1);
201
+ req->server = server;
202
+ req->vq = vq;
203
+ req->elem = elem;
204
+
205
+ if (unlikely(iov_to_buf(out_iov, out_num, 0, &req->out,
206
+ sizeof(req->out)) != sizeof(req->out))) {
207
+ error_report("virtio-blk request outhdr too short");
208
+ goto err;
209
+ }
210
+
211
+ iov_discard_front(&out_iov, &out_num, sizeof(req->out));
212
+
213
+ if (in_iov[in_num - 1].iov_len < sizeof(struct virtio_blk_inhdr)) {
214
+ error_report("virtio-blk request inhdr too short");
215
+ goto err;
216
+ }
217
+
218
+ /* We always touch the last byte, so just see how big in_iov is. */
219
+ req->in = (void *)in_iov[in_num - 1].iov_base
220
+ + in_iov[in_num - 1].iov_len
221
+ - sizeof(struct virtio_blk_inhdr);
222
+ iov_discard_back(in_iov, &in_num, sizeof(struct virtio_blk_inhdr));
223
+
224
+ type = le32_to_cpu(req->out.type);
225
+ switch (type & ~VIRTIO_BLK_T_BARRIER) {
226
+ case VIRTIO_BLK_T_IN:
227
+ case VIRTIO_BLK_T_OUT: {
228
+ ssize_t ret = 0;
229
+ bool is_write = type & VIRTIO_BLK_T_OUT;
230
+ req->sector_num = le64_to_cpu(req->out.sector);
231
+
232
+ int64_t offset = req->sector_num * vdev_blk->blk_size;
233
+ QEMUIOVector qiov;
234
+ if (is_write) {
235
+ qemu_iovec_init_external(&qiov, out_iov, out_num);
236
+ ret = blk_co_pwritev(backend, offset, qiov.size,
237
+ &qiov, 0);
238
+ } else {
239
+ qemu_iovec_init_external(&qiov, in_iov, in_num);
240
+ ret = blk_co_preadv(backend, offset, qiov.size,
241
+ &qiov, 0);
242
+ }
243
+ if (ret >= 0) {
244
+ req->in->status = VIRTIO_BLK_S_OK;
245
+ } else {
246
+ req->in->status = VIRTIO_BLK_S_IOERR;
247
+ }
248
+ break;
249
+ }
250
+ case VIRTIO_BLK_T_FLUSH:
251
+ vu_block_flush(req);
252
+ req->in->status = VIRTIO_BLK_S_OK;
253
+ break;
254
+ case VIRTIO_BLK_T_GET_ID: {
255
+ size_t size = MIN(iov_size(&elem->in_sg[0], in_num),
256
+ VIRTIO_BLK_ID_BYTES);
257
+ snprintf(elem->in_sg[0].iov_base, size, "%s", "vhost_user_blk");
258
+ req->in->status = VIRTIO_BLK_S_OK;
259
+ req->size = elem->in_sg[0].iov_len;
260
+ break;
261
+ }
262
+ case VIRTIO_BLK_T_DISCARD:
263
+ case VIRTIO_BLK_T_WRITE_ZEROES: {
264
+ int rc;
265
+ rc = vu_block_discard_write_zeroes(req, &elem->out_sg[1],
266
+ out_num, type);
267
+ if (rc == 0) {
268
+ req->in->status = VIRTIO_BLK_S_OK;
269
+ } else {
270
+ req->in->status = VIRTIO_BLK_S_IOERR;
271
+ }
272
+ break;
273
+ }
274
+ default:
275
+ req->in->status = VIRTIO_BLK_S_UNSUPP;
276
+ break;
277
+ }
278
+
279
+ vu_block_req_complete(req);
280
+ return;
281
+
282
+err:
283
+ free(elem);
284
+ g_free(req);
285
+ return;
286
+}
287
+
288
+static void vu_block_process_vq(VuDev *vu_dev, int idx)
289
+{
290
+ VuServer *server;
291
+ VuVirtq *vq;
292
+ struct req_data *req_data;
293
+
294
+ server = container_of(vu_dev, VuServer, vu_dev);
295
+ assert(server);
296
+
297
+ vq = vu_get_queue(vu_dev, idx);
298
+ assert(vq);
299
+ VuVirtqElement *elem;
300
+ while (1) {
301
+ elem = vu_queue_pop(vu_dev, vq, sizeof(VuVirtqElement) +
302
+ sizeof(VuBlockReq));
303
+ if (elem) {
304
+ req_data = g_new0(struct req_data, 1);
305
+ req_data->server = server;
306
+ req_data->vq = vq;
307
+ req_data->elem = elem;
308
+ Coroutine *co = qemu_coroutine_create(vu_block_virtio_process_req,
309
+ req_data);
310
+ aio_co_enter(server->ioc->ctx, co);
311
+ } else {
312
+ break;
313
+ }
314
+ }
315
+}
316
+
317
+static void vu_block_queue_set_started(VuDev *vu_dev, int idx, bool started)
318
+{
319
+ VuVirtq *vq;
320
+
321
+ assert(vu_dev);
322
+
323
+ vq = vu_get_queue(vu_dev, idx);
324
+ vu_set_queue_handler(vu_dev, vq, started ? vu_block_process_vq : NULL);
325
+}
326
+
327
+static uint64_t vu_block_get_features(VuDev *dev)
328
+{
329
+ uint64_t features;
330
+ VuServer *server = container_of(dev, VuServer, vu_dev);
331
+ VuBlockDev *vdev_blk = get_vu_block_device_by_server(server);
332
+ features = 1ull << VIRTIO_BLK_F_SIZE_MAX |
333
+ 1ull << VIRTIO_BLK_F_SEG_MAX |
334
+ 1ull << VIRTIO_BLK_F_TOPOLOGY |
335
+ 1ull << VIRTIO_BLK_F_BLK_SIZE |
336
+ 1ull << VIRTIO_BLK_F_FLUSH |
337
+ 1ull << VIRTIO_BLK_F_DISCARD |
338
+ 1ull << VIRTIO_BLK_F_WRITE_ZEROES |
339
+ 1ull << VIRTIO_BLK_F_CONFIG_WCE |
340
+ 1ull << VIRTIO_F_VERSION_1 |
341
+ 1ull << VIRTIO_RING_F_INDIRECT_DESC |
342
+ 1ull << VIRTIO_RING_F_EVENT_IDX |
343
+ 1ull << VHOST_USER_F_PROTOCOL_FEATURES;
344
+
345
+ if (!vdev_blk->writable) {
346
+ features |= 1ull << VIRTIO_BLK_F_RO;
347
+ }
348
+
349
+ return features;
350
+}
351
+
352
+static uint64_t vu_block_get_protocol_features(VuDev *dev)
353
+{
354
+ return 1ull << VHOST_USER_PROTOCOL_F_CONFIG |
355
+ 1ull << VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD;
356
+}
357
+
358
+static int
359
+vu_block_get_config(VuDev *vu_dev, uint8_t *config, uint32_t len)
360
+{
361
+ VuServer *server = container_of(vu_dev, VuServer, vu_dev);
362
+ VuBlockDev *vdev_blk = get_vu_block_device_by_server(server);
363
+ memcpy(config, &vdev_blk->blkcfg, len);
364
+
365
+ return 0;
366
+}
367
+
368
+static int
369
+vu_block_set_config(VuDev *vu_dev, const uint8_t *data,
370
+ uint32_t offset, uint32_t size, uint32_t flags)
371
+{
372
+ VuServer *server = container_of(vu_dev, VuServer, vu_dev);
373
+ VuBlockDev *vdev_blk = get_vu_block_device_by_server(server);
374
+ uint8_t wce;
375
+
376
+ /* don't support live migration */
377
+ if (flags != VHOST_SET_CONFIG_TYPE_MASTER) {
378
+ return -EINVAL;
379
+ }
380
+
381
+ if (offset != offsetof(struct virtio_blk_config, wce) ||
382
+ size != 1) {
383
+ return -EINVAL;
384
+ }
385
+
386
+ wce = *data;
387
+ vdev_blk->blkcfg.wce = wce;
388
+ blk_set_enable_write_cache(vdev_blk->backend, wce);
389
+ return 0;
390
+}
391
+
392
+/*
393
+ * When the client disconnects, it sends a VHOST_USER_NONE request
394
+ * and vu_process_message will simply call exit, which causes the VM
395
+ * to exit abruptly.
396
+ * To avoid this issue, process VHOST_USER_NONE request ahead
397
+ * of vu_process_message.
93
+ *
398
+ *
94
+ */
399
+ */
95
+
400
+static int vu_block_process_msg(VuDev *dev, VhostUserMsg *vmsg, int *do_reply)
96
+#include "qemu/osdep.h"
401
+{
97
+
402
+ if (vmsg->request == VHOST_USER_NONE) {
98
+#include "hw/remote/iommu.h"
403
+ dev->panic(dev, "disconnect");
99
+#include "hw/pci/pci_bus.h"
404
+ return true;
100
+#include "hw/pci/pci.h"
405
+ }
101
+#include "exec/memory.h"
406
+ return false;
102
+#include "exec/address-spaces.h"
407
+}
103
+#include "trace.h"
408
+
104
+
409
+static const VuDevIface vu_block_iface = {
105
+/**
410
+ .get_features = vu_block_get_features,
106
+ * IOMMU for TYPE_REMOTE_MACHINE - manages DMA address space isolation
411
+ .queue_set_started = vu_block_queue_set_started,
107
+ * for remote machine. It is used by TYPE_VFIO_USER_SERVER.
412
+ .get_protocol_features = vu_block_get_protocol_features,
108
+ *
413
+ .get_config = vu_block_get_config,
109
+ * - Each TYPE_VFIO_USER_SERVER instance handles one PCIDevice on a PCIBus.
414
+ .set_config = vu_block_set_config,
110
+ * There is one RemoteIommu per PCIBus, so the RemoteIommu tracks multiple
415
+ .process_msg = vu_block_process_msg,
111
+ * PCIDevices by maintaining a ->elem_by_devfn mapping.
416
+};
112
+ *
417
+
113
+ * - memory_region_init_iommu() is not used because vfio-user MemoryRegions
418
+static void blk_aio_attached(AioContext *ctx, void *opaque)
114
+ * will be added to the elem->mr container instead. This is more natural
419
+{
115
+ * than implementing the IOMMUMemoryRegionClass APIs since vfio-user
420
+ VuBlockDev *vub_dev = opaque;
116
+ * provides something that is close to a full-fledged MemoryRegion and
421
+ aio_context_acquire(ctx);
117
+ * not like an IOMMU mapping.
422
+ vhost_user_server_set_aio_context(&vub_dev->vu_server, ctx);
118
+ *
423
+ aio_context_release(ctx);
119
+ * - When a device is hot unplugged, the elem->mr reference is dropped so
424
+}
120
+ * all vfio-user MemoryRegions associated with this vfio-user server are
425
+
121
+ * destroyed.
426
+static void blk_aio_detach(void *opaque)
122
+ */
427
+{
123
+
428
+ VuBlockDev *vub_dev = opaque;
124
+static AddressSpace *remote_iommu_find_add_as(PCIBus *pci_bus,
429
+ AioContext *ctx = vub_dev->vu_server.ctx;
125
+ void *opaque, int devfn)
430
+ aio_context_acquire(ctx);
126
+{
431
+ vhost_user_server_set_aio_context(&vub_dev->vu_server, NULL);
127
+ RemoteIommu *iommu = opaque;
432
+ aio_context_release(ctx);
128
+ RemoteIommuElem *elem = NULL;
433
+}
129
+
434
+
130
+ qemu_mutex_lock(&iommu->lock);
435
+static void
131
+
436
+vu_block_initialize_config(BlockDriverState *bs,
132
+ elem = g_hash_table_lookup(iommu->elem_by_devfn, INT2VOIDP(devfn));
437
+ struct virtio_blk_config *config, uint32_t blk_size)
133
+
438
+{
134
+ if (!elem) {
439
+ config->capacity = bdrv_getlength(bs) >> BDRV_SECTOR_BITS;
135
+ elem = g_malloc0(sizeof(RemoteIommuElem));
440
+ config->blk_size = blk_size;
136
+ g_hash_table_insert(iommu->elem_by_devfn, INT2VOIDP(devfn), elem);
441
+ config->size_max = 0;
137
+ }
442
+ config->seg_max = 128 - 2;
138
+
443
+ config->min_io_size = 1;
139
+ if (!elem->mr) {
444
+ config->opt_io_size = 1;
140
+ elem->mr = MEMORY_REGION(object_new(TYPE_MEMORY_REGION));
445
+ config->num_queues = VHOST_USER_BLK_MAX_QUEUES;
141
+ memory_region_set_size(elem->mr, UINT64_MAX);
446
+ config->max_discard_sectors = 32768;
142
+ address_space_init(&elem->as, elem->mr, NULL);
447
+ config->max_discard_seg = 1;
143
+ }
448
+ config->discard_sector_alignment = config->blk_size >> 9;
144
+
449
+ config->max_write_zeroes_sectors = 32768;
145
+ qemu_mutex_unlock(&iommu->lock);
450
+ config->max_write_zeroes_seg = 1;
146
+
451
+}
147
+ return &elem->as;
452
+
148
+}
453
+static VuBlockDev *vu_block_init(VuBlockDev *vu_block_device, Error **errp)
149
+
454
+{
150
+void remote_iommu_unplug_dev(PCIDevice *pci_dev)
455
+
151
+{
456
+ BlockBackend *blk;
152
+ AddressSpace *as = pci_device_iommu_address_space(pci_dev);
457
+ Error *local_error = NULL;
153
+ RemoteIommuElem *elem = NULL;
458
+ const char *node_name = vu_block_device->node_name;
154
+
459
+ bool writable = vu_block_device->writable;
155
+ if (as == &address_space_memory) {
460
+ uint64_t perm = BLK_PERM_CONSISTENT_READ;
461
+ int ret;
462
+
463
+ AioContext *ctx;
464
+
465
+ BlockDriverState *bs = bdrv_lookup_bs(node_name, node_name, &local_error);
466
+
467
+ if (!bs) {
468
+ error_propagate(errp, local_error);
469
+ return NULL;
470
+ }
471
+
472
+ if (bdrv_is_read_only(bs)) {
473
+ writable = false;
474
+ }
475
+
476
+ if (writable) {
477
+ perm |= BLK_PERM_WRITE;
478
+ }
479
+
480
+ ctx = bdrv_get_aio_context(bs);
481
+ aio_context_acquire(ctx);
482
+ bdrv_invalidate_cache(bs, NULL);
483
+ aio_context_release(ctx);
484
+
485
+ /*
486
+ * Don't allow resize while the vhost user server is running,
487
+ * otherwise we don't care what happens with the node.
488
+ */
489
+ blk = blk_new(bdrv_get_aio_context(bs), perm,
490
+ BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE_UNCHANGED |
491
+ BLK_PERM_WRITE | BLK_PERM_GRAPH_MOD);
492
+ ret = blk_insert_bs(blk, bs, errp);
493
+
494
+ if (ret < 0) {
495
+ goto fail;
496
+ }
497
+
498
+ blk_set_enable_write_cache(blk, false);
499
+
500
+ blk_set_allow_aio_context_change(blk, true);
501
+
502
+ vu_block_device->blkcfg.wce = 0;
503
+ vu_block_device->backend = blk;
504
+ if (!vu_block_device->blk_size) {
505
+ vu_block_device->blk_size = BDRV_SECTOR_SIZE;
506
+ }
507
+ vu_block_device->blkcfg.blk_size = vu_block_device->blk_size;
508
+ blk_set_guest_block_size(blk, vu_block_device->blk_size);
509
+ vu_block_initialize_config(bs, &vu_block_device->blkcfg,
510
+ vu_block_device->blk_size);
511
+ return vu_block_device;
512
+
513
+fail:
514
+ blk_unref(blk);
515
+ return NULL;
516
+}
517
+
518
+static void vu_block_deinit(VuBlockDev *vu_block_device)
519
+{
520
+ if (vu_block_device->backend) {
521
+ blk_remove_aio_context_notifier(vu_block_device->backend, blk_aio_attached,
522
+ blk_aio_detach, vu_block_device);
523
+ }
524
+
525
+ blk_unref(vu_block_device->backend);
526
+}
527
+
528
+static void vhost_user_blk_server_stop(VuBlockDev *vu_block_device)
529
+{
530
+ vhost_user_server_stop(&vu_block_device->vu_server);
531
+ vu_block_deinit(vu_block_device);
532
+}
533
+
534
+static void vhost_user_blk_server_start(VuBlockDev *vu_block_device,
535
+ Error **errp)
536
+{
537
+ AioContext *ctx;
538
+ SocketAddress *addr = vu_block_device->addr;
539
+
540
+ if (!vu_block_init(vu_block_device, errp)) {
156
+ return;
541
+ return;
157
+ }
542
+ }
158
+
543
+
159
+ elem = container_of(as, RemoteIommuElem, as);
544
+ ctx = bdrv_get_aio_context(blk_bs(vu_block_device->backend));
160
+
545
+
161
+ address_space_destroy(&elem->as);
546
+ if (!vhost_user_server_start(&vu_block_device->vu_server, addr, ctx,
162
+
547
+ VHOST_USER_BLK_MAX_QUEUES,
163
+ object_unref(elem->mr);
548
+ NULL, &vu_block_iface,
164
+
549
+ errp)) {
165
+ elem->mr = NULL;
550
+ goto error;
166
+}
551
+ }
167
+
552
+
168
+static void remote_iommu_init(Object *obj)
553
+ blk_add_aio_context_notifier(vu_block_device->backend, blk_aio_attached,
169
+{
554
+ blk_aio_detach, vu_block_device);
170
+ RemoteIommu *iommu = REMOTE_IOMMU(obj);
555
+ vu_block_device->running = true;
171
+
556
+ return;
172
+ iommu->elem_by_devfn = g_hash_table_new_full(NULL, NULL, NULL, g_free);
557
+
173
+
558
+ error:
174
+ qemu_mutex_init(&iommu->lock);
559
+ vu_block_deinit(vu_block_device);
175
+}
560
+}
176
+
561
+
177
+static void remote_iommu_finalize(Object *obj)
562
+static bool vu_prop_modifiable(VuBlockDev *vus, Error **errp)
178
+{
563
+{
179
+ RemoteIommu *iommu = REMOTE_IOMMU(obj);
564
+ if (vus->running) {
180
+
565
+ error_setg(errp, "The property can't be modified "
181
+ qemu_mutex_destroy(&iommu->lock);
566
+ "while the server is running");
182
+
567
+ return false;
183
+ g_hash_table_destroy(iommu->elem_by_devfn);
568
+ }
184
+
569
+ return true;
185
+ iommu->elem_by_devfn = NULL;
570
+}
186
+}
571
+
187
+
572
+static void vu_set_node_name(Object *obj, const char *value, Error **errp)
188
+void remote_iommu_setup(PCIBus *pci_bus)
573
+{
189
+{
574
+ VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
190
+ RemoteIommu *iommu = NULL;
575
+
191
+
576
+ if (!vu_prop_modifiable(vus, errp)) {
192
+ g_assert(pci_bus);
577
+ return;
193
+
578
+ }
194
+ iommu = REMOTE_IOMMU(object_new(TYPE_REMOTE_IOMMU));
579
+
195
+
580
+ if (vus->node_name) {
196
+ pci_setup_iommu(pci_bus, remote_iommu_find_add_as, iommu);
581
+ g_free(vus->node_name);
197
+
582
+ }
198
+ object_property_add_child(OBJECT(pci_bus), "remote-iommu", OBJECT(iommu));
583
+
199
+
584
+ vus->node_name = g_strdup(value);
200
+ object_unref(OBJECT(iommu));
585
+}
201
+}
586
+
202
+
587
+static char *vu_get_node_name(Object *obj, Error **errp)
203
+static const TypeInfo remote_iommu_info = {
588
+{
204
+ .name = TYPE_REMOTE_IOMMU,
589
+ VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
590
+ return g_strdup(vus->node_name);
591
+}
592
+
593
+static void free_socket_addr(SocketAddress *addr)
594
+{
595
+ g_free(addr->u.q_unix.path);
596
+ g_free(addr);
597
+}
598
+
599
+static void vu_set_unix_socket(Object *obj, const char *value,
600
+ Error **errp)
601
+{
602
+ VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
603
+
604
+ if (!vu_prop_modifiable(vus, errp)) {
605
+ return;
606
+ }
607
+
608
+ if (vus->addr) {
609
+ free_socket_addr(vus->addr);
610
+ }
611
+
612
+ SocketAddress *addr = g_new0(SocketAddress, 1);
613
+ addr->type = SOCKET_ADDRESS_TYPE_UNIX;
614
+ addr->u.q_unix.path = g_strdup(value);
615
+ vus->addr = addr;
616
+}
617
+
618
+static char *vu_get_unix_socket(Object *obj, Error **errp)
619
+{
620
+ VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
621
+ return g_strdup(vus->addr->u.q_unix.path);
622
+}
623
+
624
+static bool vu_get_block_writable(Object *obj, Error **errp)
625
+{
626
+ VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
627
+ return vus->writable;
628
+}
629
+
630
+static void vu_set_block_writable(Object *obj, bool value, Error **errp)
631
+{
632
+ VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
633
+
634
+ if (!vu_prop_modifiable(vus, errp)) {
635
+ return;
636
+ }
637
+
638
+ vus->writable = value;
639
+}
640
+
641
+static void vu_get_blk_size(Object *obj, Visitor *v, const char *name,
642
+ void *opaque, Error **errp)
643
+{
644
+ VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
645
+ uint32_t value = vus->blk_size;
646
+
647
+ visit_type_uint32(v, name, &value, errp);
648
+}
649
+
650
+static void vu_set_blk_size(Object *obj, Visitor *v, const char *name,
651
+ void *opaque, Error **errp)
652
+{
653
+ VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
654
+
655
+ Error *local_err = NULL;
656
+ uint32_t value;
657
+
658
+ if (!vu_prop_modifiable(vus, errp)) {
659
+ return;
660
+ }
661
+
662
+ visit_type_uint32(v, name, &value, &local_err);
663
+ if (local_err) {
664
+ goto out;
665
+ }
666
+
667
+ check_block_size(object_get_typename(obj), name, value, &local_err);
668
+ if (local_err) {
669
+ goto out;
670
+ }
671
+
672
+ vus->blk_size = value;
673
+
674
+out:
675
+ error_propagate(errp, local_err);
676
+}
677
+
678
+static void vhost_user_blk_server_instance_finalize(Object *obj)
679
+{
680
+ VuBlockDev *vub = VHOST_USER_BLK_SERVER(obj);
681
+
682
+ vhost_user_blk_server_stop(vub);
683
+
684
+ /*
685
+ * Unlike object_property_add_str, object_class_property_add_str
686
+ * doesn't have a release method. Thus manual memory freeing is
687
+ * needed.
688
+ */
689
+ free_socket_addr(vub->addr);
690
+ g_free(vub->node_name);
691
+}
692
+
693
+static void vhost_user_blk_server_complete(UserCreatable *obj, Error **errp)
694
+{
695
+ VuBlockDev *vub = VHOST_USER_BLK_SERVER(obj);
696
+
697
+ vhost_user_blk_server_start(vub, errp);
698
+}
699
+
700
+static void vhost_user_blk_server_class_init(ObjectClass *klass,
701
+ void *class_data)
702
+{
703
+ UserCreatableClass *ucc = USER_CREATABLE_CLASS(klass);
704
+ ucc->complete = vhost_user_blk_server_complete;
705
+
706
+ object_class_property_add_bool(klass, "writable",
707
+ vu_get_block_writable,
708
+ vu_set_block_writable);
709
+
710
+ object_class_property_add_str(klass, "node-name",
711
+ vu_get_node_name,
712
+ vu_set_node_name);
713
+
714
+ object_class_property_add_str(klass, "unix-socket",
715
+ vu_get_unix_socket,
716
+ vu_set_unix_socket);
717
+
718
+ object_class_property_add(klass, "logical-block-size", "uint32",
719
+ vu_get_blk_size, vu_set_blk_size,
720
+ NULL, NULL);
721
+}
722
+
723
+static const TypeInfo vhost_user_blk_server_info = {
724
+ .name = TYPE_VHOST_USER_BLK_SERVER,
205
+ .parent = TYPE_OBJECT,
725
+ .parent = TYPE_OBJECT,
206
+ .instance_size = sizeof(RemoteIommu),
726
+ .instance_size = sizeof(VuBlockDev),
207
+ .instance_init = remote_iommu_init,
727
+ .instance_finalize = vhost_user_blk_server_instance_finalize,
208
+ .instance_finalize = remote_iommu_finalize,
728
+ .class_init = vhost_user_blk_server_class_init,
729
+ .interfaces = (InterfaceInfo[]) {
730
+ {TYPE_USER_CREATABLE},
731
+ {}
732
+ },
209
+};
733
+};
210
+
734
+
211
+static void remote_iommu_register_types(void)
735
+static void vhost_user_blk_server_register_types(void)
212
+{
736
+{
213
+ type_register_static(&remote_iommu_info);
737
+ type_register_static(&vhost_user_blk_server_info);
214
+}
738
+}
215
+
739
+
216
+type_init(remote_iommu_register_types)
740
+type_init(vhost_user_blk_server_register_types)
217
diff --git a/hw/remote/machine.c b/hw/remote/machine.c
741
diff --git a/softmmu/vl.c b/softmmu/vl.c
218
index XXXXXXX..XXXXXXX 100644
742
index XXXXXXX..XXXXXXX 100644
219
--- a/hw/remote/machine.c
743
--- a/softmmu/vl.c
220
+++ b/hw/remote/machine.c
744
+++ b/softmmu/vl.c
221
@@ -XXX,XX +XXX,XX @@
745
@@ -XXX,XX +XXX,XX @@ static bool object_create_initial(const char *type, QemuOpts *opts)
222
#include "qapi/error.h"
746
}
223
#include "hw/pci/pci_host.h"
747
#endif
224
#include "hw/remote/iohub.h"
748
225
+#include "hw/remote/iommu.h"
749
+ /* Reason: vhost-user-blk-server property "node-name" */
226
#include "hw/qdev-core.h"
750
+ if (g_str_equal(type, "vhost-user-blk-server")) {
227
751
+ return false;
228
static void remote_machine_init(MachineState *machine)
752
+ }
229
@@ -XXX,XX +XXX,XX @@ static void remote_machine_instance_init(Object *obj)
753
/*
230
s->auto_shutdown = true;
754
* Reason: filter-* property "netdev" etc.
231
}
755
*/
232
756
diff --git a/block/meson.build b/block/meson.build
233
+static void remote_machine_dev_unplug_cb(HotplugHandler *hotplug_dev,
234
+ DeviceState *dev, Error **errp)
235
+{
236
+ qdev_unrealize(dev);
237
+
238
+ if (object_dynamic_cast(OBJECT(dev), TYPE_PCI_DEVICE)) {
239
+ remote_iommu_unplug_dev(PCI_DEVICE(dev));
240
+ }
241
+}
242
+
243
static void remote_machine_class_init(ObjectClass *oc, void *data)
244
{
245
MachineClass *mc = MACHINE_CLASS(oc);
246
@@ -XXX,XX +XXX,XX @@ static void remote_machine_class_init(ObjectClass *oc, void *data)
247
mc->init = remote_machine_init;
248
mc->desc = "Experimental remote machine";
249
250
- hc->unplug = qdev_simple_device_unplug_cb;
251
+ hc->unplug = remote_machine_dev_unplug_cb;
252
253
object_class_property_add_bool(oc, "vfio-user",
254
remote_machine_get_vfio_user,
255
diff --git a/hw/remote/meson.build b/hw/remote/meson.build
256
index XXXXXXX..XXXXXXX 100644
757
index XXXXXXX..XXXXXXX 100644
257
--- a/hw/remote/meson.build
758
--- a/block/meson.build
258
+++ b/hw/remote/meson.build
759
+++ b/block/meson.build
259
@@ -XXX,XX +XXX,XX @@ remote_ss.add(when: 'CONFIG_MULTIPROCESS', if_true: files('message.c'))
760
@@ -XXX,XX +XXX,XX @@ block_ss.add(when: 'CONFIG_WIN32', if_true: files('file-win32.c', 'win32-aio.c')
260
remote_ss.add(when: 'CONFIG_MULTIPROCESS', if_true: files('remote-obj.c'))
761
block_ss.add(when: 'CONFIG_POSIX', if_true: [files('file-posix.c'), coref, iokit])
261
remote_ss.add(when: 'CONFIG_MULTIPROCESS', if_true: files('proxy.c'))
762
block_ss.add(when: 'CONFIG_LIBISCSI', if_true: files('iscsi-opts.c'))
262
remote_ss.add(when: 'CONFIG_MULTIPROCESS', if_true: files('iohub.c'))
763
block_ss.add(when: 'CONFIG_LINUX', if_true: files('nvme.c'))
263
+remote_ss.add(when: 'CONFIG_MULTIPROCESS', if_true: files('iommu.c'))
764
+block_ss.add(when: 'CONFIG_LINUX', if_true: files('export/vhost-user-blk-server.c', '../contrib/libvhost-user/libvhost-user.c'))
264
remote_ss.add(when: 'CONFIG_VFIO_USER_SERVER', if_true: files('vfio-user-obj.c'))
765
block_ss.add(when: 'CONFIG_REPLICATION', if_true: files('replication.c'))
265
766
block_ss.add(when: 'CONFIG_SHEEPDOG', if_true: files('sheepdog.c'))
266
remote_ss.add(when: 'CONFIG_VFIO_USER_SERVER', if_true: libvfio_user_dep)
767
block_ss.add(when: ['CONFIG_LINUX_AIO', libaio], if_true: files('linux-aio.c'))
267
--
768
--
268
2.36.1
769
2.26.2
269
770
270
New patch
1
From: Coiby Xu <coiby.xu@gmail.com>
1
2
3
Suggested-by: Stefano Garzarella <sgarzare@redhat.com>
4
Signed-off-by: Coiby Xu <coiby.xu@gmail.com>
5
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
6
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
7
Message-id: 20200918080912.321299-8-coiby.xu@gmail.com
8
[Removed reference to vhost-user-blk-test.c, it will be sent in a
9
separate pull request.
10
--Stefan]
11
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
12
---
13
MAINTAINERS | 7 +++++++
14
1 file changed, 7 insertions(+)
15
16
diff --git a/MAINTAINERS b/MAINTAINERS
17
index XXXXXXX..XXXXXXX 100644
18
--- a/MAINTAINERS
19
+++ b/MAINTAINERS
20
@@ -XXX,XX +XXX,XX @@ L: qemu-block@nongnu.org
21
S: Supported
22
F: tests/image-fuzzer/
23
24
+Vhost-user block device backend server
25
+M: Coiby Xu <Coiby.Xu@gmail.com>
26
+S: Maintained
27
+F: block/export/vhost-user-blk-server.c
28
+F: util/vhost-user-server.c
29
+F: tests/qtest/libqos/vhost-user-blk.c
30
+
31
Replication
32
M: Wen Congyang <wencongyang2@huawei.com>
33
M: Xie Changlong <xiechanglong.d@gmail.com>
34
--
35
2.26.2
36
New patch
1
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2
Message-id: 20200924151549.913737-3-stefanha@redhat.com
3
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
4
---
5
util/vhost-user-server.c | 2 +-
6
1 file changed, 1 insertion(+), 1 deletion(-)
1
7
8
diff --git a/util/vhost-user-server.c b/util/vhost-user-server.c
9
index XXXXXXX..XXXXXXX 100644
10
--- a/util/vhost-user-server.c
11
+++ b/util/vhost-user-server.c
12
@@ -XXX,XX +XXX,XX @@ bool vhost_user_server_start(VuServer *server,
13
return false;
14
}
15
16
- /* zero out unspecified fileds */
17
+ /* zero out unspecified fields */
18
*server = (VuServer) {
19
.listener = listener,
20
.vu_iface = vu_iface,
21
--
22
2.26.2
23
New patch
1
We already have access to the value with the correct type (ioc and sioc
2
are the same QIOChannel).
1
3
4
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
5
Message-id: 20200924151549.913737-4-stefanha@redhat.com
6
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
7
---
8
util/vhost-user-server.c | 2 +-
9
1 file changed, 1 insertion(+), 1 deletion(-)
10
11
diff --git a/util/vhost-user-server.c b/util/vhost-user-server.c
12
index XXXXXXX..XXXXXXX 100644
13
--- a/util/vhost-user-server.c
14
+++ b/util/vhost-user-server.c
15
@@ -XXX,XX +XXX,XX @@ static void vu_accept(QIONetListener *listener, QIOChannelSocket *sioc,
16
server->ioc = QIO_CHANNEL(sioc);
17
object_ref(OBJECT(server->ioc));
18
qio_channel_attach_aio_context(server->ioc, server->ctx);
19
- qio_channel_set_blocking(QIO_CHANNEL(server->sioc), false, NULL);
20
+ qio_channel_set_blocking(server->ioc, false, NULL);
21
vu_client_start(server);
22
}
23
24
--
25
2.26.2
26
New patch
1
Explicitly deleting watches is not necessary since libvhost-user calls
2
remove_watch() during vu_deinit(). Add an assertion to check this
3
though.
1
4
5
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
6
Message-id: 20200924151549.913737-5-stefanha@redhat.com
7
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
8
---
9
util/vhost-user-server.c | 19 ++++---------------
10
1 file changed, 4 insertions(+), 15 deletions(-)
11
12
diff --git a/util/vhost-user-server.c b/util/vhost-user-server.c
13
index XXXXXXX..XXXXXXX 100644
14
--- a/util/vhost-user-server.c
15
+++ b/util/vhost-user-server.c
16
@@ -XXX,XX +XXX,XX @@ static void close_client(VuServer *server)
17
/* When this is set vu_client_trip will stop new processing vhost-user message */
18
server->sioc = NULL;
19
20
- VuFdWatch *vu_fd_watch, *next;
21
- QTAILQ_FOREACH_SAFE(vu_fd_watch, &server->vu_fd_watches, next, next) {
22
- aio_set_fd_handler(server->ioc->ctx, vu_fd_watch->fd, true, NULL,
23
- NULL, NULL, NULL);
24
- }
25
-
26
- while (!QTAILQ_EMPTY(&server->vu_fd_watches)) {
27
- QTAILQ_FOREACH_SAFE(vu_fd_watch, &server->vu_fd_watches, next, next) {
28
- if (!vu_fd_watch->processing) {
29
- QTAILQ_REMOVE(&server->vu_fd_watches, vu_fd_watch, next);
30
- g_free(vu_fd_watch);
31
- }
32
- }
33
- }
34
-
35
while (server->processing_msg) {
36
if (server->ioc->read_coroutine) {
37
server->ioc->read_coroutine = NULL;
38
@@ -XXX,XX +XXX,XX @@ static void close_client(VuServer *server)
39
}
40
41
vu_deinit(&server->vu_dev);
42
+
43
+ /* vu_deinit() should have called remove_watch() */
44
+ assert(QTAILQ_EMPTY(&server->vu_fd_watches));
45
+
46
object_unref(OBJECT(sioc));
47
object_unref(OBJECT(server->ioc));
48
}
49
--
50
2.26.2
51
1
From: Jagannathan Raman <jag.raman@oracle.com>
1
Only one struct is needed per request. Drop req_data and the separate
2
VuBlockReq instance. Instead let vu_queue_pop() allocate everything at
3
once.
2
4
3
Allow hotplugging of PCI(e) devices to remote machine
5
This fixes the req_data memory leak in vu_block_virtio_process_req().
4
6
5
Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
7
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
6
Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
8
Message-id: 20200924151549.913737-6-stefanha@redhat.com
7
Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
8
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
9
Message-id: d1e6cfa0afb528ad343758f9b1d918be0175c5e5.1655151679.git.jag.raman@oracle.com
10
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
9
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
11
---
10
---
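Note on the "Only one struct is needed per request" commit message: the idea of letting the queue pop allocate the element and the per-request state in one block can be sketched in isolation. This is a minimal illustration, not the patch itself. Elem, BlockReq, queue_pop(), block_req_pop() and block_req_free() are hypothetical stand-ins for VuVirtqElement, VuBlockReq and vu_queue_pop(); the only assumption carried over from the patch is that the pop function returns a buffer of the requested size whose leading bytes are the element.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for VuVirtqElement: the generic part of a request. */
typedef struct Elem {
    unsigned index;
} Elem;

/* Hypothetical stand-in for VuBlockReq: the element is embedded, not pointed to. */
typedef struct BlockReq {
    Elem elem;          /* must remain the first member */
    long sector_num;
    size_t size;
    void *owner;
} BlockReq;

/*
 * Hypothetical stand-in for vu_queue_pop(): allocates sz zeroed bytes whose
 * leading sizeof(Elem) bytes are the element. Returns NULL if sz is too small
 * or the allocation fails.
 */
static void *queue_pop(unsigned index, size_t sz)
{
    Elem *elem;

    if (sz < sizeof(Elem)) {
        return NULL;
    }
    elem = calloc(1, sz);
    if (elem) {
        elem->index = index;
    }
    return elem;
}

/* One allocation yields both the element and the per-request state. */
static BlockReq *block_req_pop(unsigned index, void *owner)
{
    BlockReq *req;

    /* The cast below is only valid because elem sits at offset 0. */
    assert(offsetof(BlockReq, elem) == 0);

    req = queue_pop(index, sizeof(BlockReq));
    if (req) {
        req->owner = owner;
    }
    return req;
}

/* One free() releases everything; there is no second object left to leak. */
static void block_req_free(BlockReq *req)
{
    free(req);
}
```

Because the request and its element share one allocation, every exit path needs exactly one free(), which is what removes the opportunity for the req_data leak described above.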
12
hw/remote/machine.c | 10 ++++++++++
11
block/export/vhost-user-blk-server.c | 68 +++++++++-------------------
13
1 file changed, 10 insertions(+)
12
1 file changed, 21 insertions(+), 47 deletions(-)
14
13
15
diff --git a/hw/remote/machine.c b/hw/remote/machine.c
14
diff --git a/block/export/vhost-user-blk-server.c b/block/export/vhost-user-blk-server.c
16
index XXXXXXX..XXXXXXX 100644
15
index XXXXXXX..XXXXXXX 100644
17
--- a/hw/remote/machine.c
16
--- a/block/export/vhost-user-blk-server.c
18
+++ b/hw/remote/machine.c
17
+++ b/block/export/vhost-user-blk-server.c
19
@@ -XXX,XX +XXX,XX @@
18
@@ -XXX,XX +XXX,XX @@ struct virtio_blk_inhdr {
20
#include "qapi/error.h"
19
};
21
#include "hw/pci/pci_host.h"
20
22
#include "hw/remote/iohub.h"
21
typedef struct VuBlockReq {
23
+#include "hw/qdev-core.h"
22
- VuVirtqElement *elem;
24
23
+ VuVirtqElement elem;
25
static void remote_machine_init(MachineState *machine)
24
int64_t sector_num;
25
size_t size;
26
struct virtio_blk_inhdr *in;
27
@@ -XXX,XX +XXX,XX @@ static void vu_block_req_complete(VuBlockReq *req)
28
VuDev *vu_dev = &req->server->vu_dev;
29
30
/* IO size with 1 extra status byte */
31
- vu_queue_push(vu_dev, req->vq, req->elem, req->size + 1);
32
+ vu_queue_push(vu_dev, req->vq, &req->elem, req->size + 1);
33
vu_queue_notify(vu_dev, req->vq);
34
35
- if (req->elem) {
36
- free(req->elem);
37
- }
38
-
39
- g_free(req);
40
+ free(req);
41
}
42
43
static VuBlockDev *get_vu_block_device_by_server(VuServer *server)
44
@@ -XXX,XX +XXX,XX @@ static void coroutine_fn vu_block_flush(VuBlockReq *req)
45
blk_co_flush(backend);
46
}
47
48
-struct req_data {
49
- VuServer *server;
50
- VuVirtq *vq;
51
- VuVirtqElement *elem;
52
-};
53
-
54
static void coroutine_fn vu_block_virtio_process_req(void *opaque)
26
{
55
{
27
@@ -XXX,XX +XXX,XX @@ static void remote_machine_init(MachineState *machine)
56
- struct req_data *data = opaque;
28
57
- VuServer *server = data->server;
29
pci_bus_irqs(pci_host->bus, remote_iohub_set_irq, remote_iohub_map_irq,
58
- VuVirtq *vq = data->vq;
30
&s->iohub, REMOTE_IOHUB_NB_PIRQS);
59
- VuVirtqElement *elem = data->elem;
60
+ VuBlockReq *req = opaque;
61
+ VuServer *server = req->server;
62
+ VuVirtqElement *elem = &req->elem;
63
uint32_t type;
64
- VuBlockReq *req;
65
66
VuBlockDev *vdev_blk = get_vu_block_device_by_server(server);
67
BlockBackend *backend = vdev_blk->backend;
68
@@ -XXX,XX +XXX,XX @@ static void coroutine_fn vu_block_virtio_process_req(void *opaque)
69
struct iovec *out_iov = elem->out_sg;
70
unsigned in_num = elem->in_num;
71
unsigned out_num = elem->out_num;
31
+
72
+
32
+ qbus_set_hotplug_handler(BUS(pci_host->bus), OBJECT(s));
73
/* refer to hw/block/virtio_blk.c */
74
if (elem->out_num < 1 || elem->in_num < 1) {
75
error_report("virtio-blk request missing headers");
76
- free(elem);
77
- return;
78
+ goto err;
79
}
80
81
- req = g_new0(VuBlockReq, 1);
82
- req->server = server;
83
- req->vq = vq;
84
- req->elem = elem;
85
-
86
if (unlikely(iov_to_buf(out_iov, out_num, 0, &req->out,
87
sizeof(req->out)) != sizeof(req->out))) {
88
error_report("virtio-blk request outhdr too short");
89
@@ -XXX,XX +XXX,XX @@ static void coroutine_fn vu_block_virtio_process_req(void *opaque)
90
91
err:
92
free(elem);
93
- g_free(req);
94
- return;
33
}
95
}
34
96
35
static void remote_machine_class_init(ObjectClass *oc, void *data)
97
static void vu_block_process_vq(VuDev *vu_dev, int idx)
36
{
98
{
37
MachineClass *mc = MACHINE_CLASS(oc);
99
- VuServer *server;
38
+ HotplugHandlerClass *hc = HOTPLUG_HANDLER_CLASS(oc);
100
- VuVirtq *vq;
39
101
- struct req_data *req_data;
40
mc->init = remote_machine_init;
102
+ VuServer *server = container_of(vu_dev, VuServer, vu_dev);
41
mc->desc = "Experimental remote machine";
103
+ VuVirtq *vq = vu_get_queue(vu_dev, idx);
104
105
- server = container_of(vu_dev, VuServer, vu_dev);
106
- assert(server);
107
-
108
- vq = vu_get_queue(vu_dev, idx);
109
- assert(vq);
110
- VuVirtqElement *elem;
111
while (1) {
112
- elem = vu_queue_pop(vu_dev, vq, sizeof(VuVirtqElement) +
113
- sizeof(VuBlockReq));
114
- if (elem) {
115
- req_data = g_new0(struct req_data, 1);
116
- req_data->server = server;
117
- req_data->vq = vq;
118
- req_data->elem = elem;
119
- Coroutine *co = qemu_coroutine_create(vu_block_virtio_process_req,
120
- req_data);
121
- aio_co_enter(server->ioc->ctx, co);
122
- } else {
123
+ VuBlockReq *req;
42
+
124
+
43
+ hc->unplug = qdev_simple_device_unplug_cb;
125
+ req = vu_queue_pop(vu_dev, vq, sizeof(VuBlockReq));
126
+ if (!req) {
127
break;
128
}
129
+
130
+ req->server = server;
131
+ req->vq = vq;
132
+
133
+ Coroutine *co =
134
+ qemu_coroutine_create(vu_block_virtio_process_req, req);
135
+ qemu_coroutine_enter(co);
136
}
44
}
137
}
45
138
46
static const TypeInfo remote_machine = {
47
@@ -XXX,XX +XXX,XX @@ static const TypeInfo remote_machine = {
48
.parent = TYPE_MACHINE,
49
.instance_size = sizeof(RemoteMachineState),
50
.class_init = remote_machine_class_init,
51
+ .interfaces = (InterfaceInfo[]) {
52
+ { TYPE_HOTPLUG_HANDLER },
53
+ { }
54
+ }
55
};
56
57
static void remote_machine_register_types(void)
58
--
139
--
59
2.36.1
140
2.26.2
141
New patch
1
The device panic notifier callback is not used. Drop it.
1
2
3
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
4
Message-id: 20200924151549.913737-7-stefanha@redhat.com
5
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
6
---
7
util/vhost-user-server.h | 3 ---
8
block/export/vhost-user-blk-server.c | 3 +--
9
util/vhost-user-server.c | 6 ------
10
3 files changed, 1 insertion(+), 11 deletions(-)
11
12
diff --git a/util/vhost-user-server.h b/util/vhost-user-server.h
13
index XXXXXXX..XXXXXXX 100644
14
--- a/util/vhost-user-server.h
15
+++ b/util/vhost-user-server.h
16
@@ -XXX,XX +XXX,XX @@ typedef struct VuFdWatch {
17
} VuFdWatch;
18
19
typedef struct VuServer VuServer;
20
-typedef void DevicePanicNotifierFn(VuServer *server);
21
22
struct VuServer {
23
QIONetListener *listener;
24
AioContext *ctx;
25
- DevicePanicNotifierFn *device_panic_notifier;
26
int max_queues;
27
const VuDevIface *vu_iface;
28
VuDev vu_dev;
29
@@ -XXX,XX +XXX,XX @@ bool vhost_user_server_start(VuServer *server,
30
SocketAddress *unix_socket,
31
AioContext *ctx,
32
uint16_t max_queues,
33
- DevicePanicNotifierFn *device_panic_notifier,
34
const VuDevIface *vu_iface,
35
Error **errp);
36
37
diff --git a/block/export/vhost-user-blk-server.c b/block/export/vhost-user-blk-server.c
38
index XXXXXXX..XXXXXXX 100644
39
--- a/block/export/vhost-user-blk-server.c
40
+++ b/block/export/vhost-user-blk-server.c
41
@@ -XXX,XX +XXX,XX @@ static void vhost_user_blk_server_start(VuBlockDev *vu_block_device,
42
ctx = bdrv_get_aio_context(blk_bs(vu_block_device->backend));
43
44
if (!vhost_user_server_start(&vu_block_device->vu_server, addr, ctx,
45
- VHOST_USER_BLK_MAX_QUEUES,
46
- NULL, &vu_block_iface,
47
+ VHOST_USER_BLK_MAX_QUEUES, &vu_block_iface,
48
errp)) {
49
goto error;
50
}
51
diff --git a/util/vhost-user-server.c b/util/vhost-user-server.c
52
index XXXXXXX..XXXXXXX 100644
53
--- a/util/vhost-user-server.c
54
+++ b/util/vhost-user-server.c
55
@@ -XXX,XX +XXX,XX @@ static void panic_cb(VuDev *vu_dev, const char *buf)
56
close_client(server);
57
}
58
59
- if (server->device_panic_notifier) {
60
- server->device_panic_notifier(server);
61
- }
62
-
63
/*
64
* Set the callback function for network listener so another
65
* vhost-user client can connect to this server
66
@@ -XXX,XX +XXX,XX @@ bool vhost_user_server_start(VuServer *server,
67
SocketAddress *socket_addr,
68
AioContext *ctx,
69
uint16_t max_queues,
70
- DevicePanicNotifierFn *device_panic_notifier,
71
const VuDevIface *vu_iface,
72
Error **errp)
73
{
74
@@ -XXX,XX +XXX,XX @@ bool vhost_user_server_start(VuServer *server,
75
.vu_iface = vu_iface,
76
.max_queues = max_queues,
77
.ctx = ctx,
78
- .device_panic_notifier = device_panic_notifier,
79
};
80
81
qio_net_listener_set_name(server->listener, "vhost-user-backend-listener");
82
--
83
2.26.2
84
New patch
1
fds[] is leaked when qio_channel_readv_full() fails.
1
2
3
Use vmsg->fds[] instead of keeping a local fds[] array. Then we can
4
reuse goto fail to clean up fds. vmsg->fd_num must be zeroed before the
5
loop to make this safe.
6
7
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
8
Message-id: 20200924151549.913737-8-stefanha@redhat.com
9
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10
---
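Note on the fds[] leak fix: the pattern in the commit message (accumulate received fds directly in the message, zero the count before the loop, and route every failure through one cleanup label) can be sketched without QEMU. This is an illustrative sketch under stated assumptions, not the code in util/vhost-user-server.c. Msg, ReadFn, ChunkSource, chunk_read() and msg_read_header() are hypothetical names, HDR_SIZE and MAX_FDS are arbitrary values, and the reader callback is assumed to return bytes read, 0 at end of stream, or -1 on error.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define HDR_SIZE 12
#define MAX_FDS 8

/* Hypothetical message: header bytes plus fds accumulated across reads. */
typedef struct Msg {
    unsigned char hdr[HDR_SIZE];
    int fds[MAX_FDS];
    size_t fd_num;
} Msg;

/* Hypothetical reader: bytes read (>0), 0 at end of stream, -1 on error. */
typedef int (*ReadFn)(void *opaque, unsigned char *buf, size_t len,
                      int *fds, size_t *nfds);

/* Test source that forces short reads and hands out one fake fd per read. */
typedef struct ChunkSource {
    const unsigned char *data;
    size_t len;     /* bytes remaining */
    size_t chunk;   /* maximum bytes returned per call */
    int next_fd;    /* fake descriptor number, no real fd is opened */
} ChunkSource;

static int chunk_read(void *opaque, unsigned char *buf, size_t len,
                      int *fds, size_t *nfds)
{
    ChunkSource *src = opaque;
    size_t n = src->len < src->chunk ? src->len : src->chunk;

    if (n > len) {
        n = len;
    }
    memcpy(buf, src->data, n);
    src->data += n;
    src->len -= n;
    if (n > 0) {
        fds[0] = src->next_fd++;
        *nfds = 1;
    }
    return (int)n;
}

static bool msg_read_header(Msg *msg, ReadFn read_fn, void *opaque)
{
    size_t done = 0;

    msg->fd_num = 0; /* zeroed first so the fail path is safe at any point */

    while (done != HDR_SIZE) {
        int fds[MAX_FDS];
        size_t nfds = 0;
        int rc = read_fn(opaque, msg->hdr + done, HDR_SIZE - done, fds, &nfds);

        if (rc < 0) {
            goto fail;
        }
        assert((size_t)rc <= HDR_SIZE - done);
        if (nfds > MAX_FDS - msg->fd_num) {
            goto fail; /* too many fds in total */
        }
        if (nfds > 0) {
            memcpy(msg->fds + msg->fd_num, fds, nfds * sizeof(fds[0]));
            msg->fd_num += nfds;
        }
        if (rc == 0) {
            goto fail; /* stream ended before a full header arrived */
        }
        done += (size_t)rc;
    }
    return true;

fail:
    /* A real implementation would close() each accumulated fd here. */
    msg->fd_num = 0;
    return false;
}
```

Keeping the running count in the message itself means the cleanup label always knows how many descriptors it owns, regardless of which iteration failed.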
11
util/vhost-user-server.c | 50 ++++++++++++++++++----------------------
12
1 file changed, 23 insertions(+), 27 deletions(-)
13
14
diff --git a/util/vhost-user-server.c b/util/vhost-user-server.c
15
index XXXXXXX..XXXXXXX 100644
16
--- a/util/vhost-user-server.c
17
+++ b/util/vhost-user-server.c
18
@@ -XXX,XX +XXX,XX @@ vu_message_read(VuDev *vu_dev, int conn_fd, VhostUserMsg *vmsg)
19
};
20
int rc, read_bytes = 0;
21
Error *local_err = NULL;
22
- /*
23
- * Store fds/nfds returned from qio_channel_readv_full into
24
- * temporary variables.
25
- *
26
- * VhostUserMsg is a packed structure, gcc will complain about passing
27
- * pointer to a packed structure member if we pass &VhostUserMsg.fd_num
28
- * and &VhostUserMsg.fds directly when calling qio_channel_readv_full,
29
- * thus two temporary variables nfds and fds are used here.
30
- */
31
- size_t nfds = 0, nfds_t = 0;
32
const size_t max_fds = G_N_ELEMENTS(vmsg->fds);
33
- int *fds_t = NULL;
34
VuServer *server = container_of(vu_dev, VuServer, vu_dev);
35
QIOChannel *ioc = server->ioc;
36
37
+ vmsg->fd_num = 0;
38
if (!ioc) {
39
error_report_err(local_err);
40
goto fail;
41
@@ -XXX,XX +XXX,XX @@ vu_message_read(VuDev *vu_dev, int conn_fd, VhostUserMsg *vmsg)
42
43
assert(qemu_in_coroutine());
44
do {
45
+ size_t nfds = 0;
46
+ int *fds = NULL;
47
+
48
/*
49
* qio_channel_readv_full may have short reads, keeping calling it
50
* until getting VHOST_USER_HDR_SIZE or 0 bytes in total
51
*/
52
- rc = qio_channel_readv_full(ioc, &iov, 1, &fds_t, &nfds_t, &local_err);
53
+ rc = qio_channel_readv_full(ioc, &iov, 1, &fds, &nfds, &local_err);
54
if (rc < 0) {
55
if (rc == QIO_CHANNEL_ERR_BLOCK) {
56
+ assert(local_err == NULL);
57
qio_channel_yield(ioc, G_IO_IN);
58
continue;
59
} else {
60
error_report_err(local_err);
61
- return false;
62
+ goto fail;
63
}
64
}
65
- read_bytes += rc;
66
- if (nfds_t > 0) {
67
- if (nfds + nfds_t > max_fds) {
68
+
69
+ if (nfds > 0) {
70
+ if (vmsg->fd_num + nfds > max_fds) {
71
error_report("A maximum of %zu fds are allowed, "
72
"however got %zu fds now",
73
- max_fds, nfds + nfds_t);
74
+ max_fds, vmsg->fd_num + nfds);
75
+ g_free(fds);
76
goto fail;
77
}
78
- memcpy(vmsg->fds + nfds, fds_t,
79
- nfds_t *sizeof(vmsg->fds[0]));
80
- nfds += nfds_t;
81
- g_free(fds_t);
82
+ memcpy(vmsg->fds + vmsg->fd_num, fds, nfds * sizeof(vmsg->fds[0]));
83
+ vmsg->fd_num += nfds;
84
+ g_free(fds);
85
}
86
- if (read_bytes == VHOST_USER_HDR_SIZE || rc == 0) {
87
- break;
88
+
89
+ if (rc == 0) { /* socket closed */
90
+ goto fail;
91
}
92
- iov.iov_base = (char *)vmsg + read_bytes;
93
- iov.iov_len = VHOST_USER_HDR_SIZE - read_bytes;
94
- } while (true);
95
96
- vmsg->fd_num = nfds;
97
+ iov.iov_base += rc;
98
+ iov.iov_len -= rc;
99
+ read_bytes += rc;
100
+ } while (read_bytes != VHOST_USER_HDR_SIZE);
101
+
102
/* qio_channel_readv_full will make socket fds blocking, unblock them */
103
vmsg_unblock_fds(vmsg);
104
if (vmsg->size > sizeof(vmsg->payload)) {
105
--
106
2.26.2
107
New patch
1
Unexpected EOF is an error that must be reported.
1
2
3
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
4
Message-id: 20200924151549.913737-9-stefanha@redhat.com
5
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
6
---
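Note on "Unexpected EOF is an error": the change from `rc == -1` to `rc != 1` matters because the read can end in three ways, not two. The sketch below shows the difference between the two checks. It assumes, based on the `rc != 1` test in the patch, that 1 means the full payload was read, 0 means end of stream, and -1 means an error with an error object set. READ_ERROR, READ_EOF, READ_OK, payload_ok_old() and payload_ok_new() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names for the three outcomes the check has to handle. */
enum { READ_ERROR = -1, READ_EOF = 0, READ_OK = 1 };

/* Old-style check: only an explicit error counts as failure. */
static bool payload_ok_old(int rc)
{
    return rc != READ_ERROR;
}

/*
 * New-style check: anything other than full success fails. On failure,
 * *report receives the error text to print, or NULL when there is none,
 * which is the case for end of stream.
 */
static bool payload_ok_new(int rc, const char *err, const char **report)
{
    if (rc != READ_OK) {
        *report = err; /* may be NULL: EOF carries no error object */
        return false;
    }
    *report = NULL;
    return true;
}
```

With the old check a truncated payload is accepted as if it had been read in full; with the new one it takes the failure path, and the error is only reported when one was actually set.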
7
util/vhost-user-server.c | 6 ++++--
8
1 file changed, 4 insertions(+), 2 deletions(-)
9
10
diff --git a/util/vhost-user-server.c b/util/vhost-user-server.c
11
index XXXXXXX..XXXXXXX 100644
12
--- a/util/vhost-user-server.c
13
+++ b/util/vhost-user-server.c
14
@@ -XXX,XX +XXX,XX @@ vu_message_read(VuDev *vu_dev, int conn_fd, VhostUserMsg *vmsg)
15
};
16
if (vmsg->size) {
17
rc = qio_channel_readv_all_eof(ioc, &iov_payload, 1, &local_err);
18
- if (rc == -1) {
19
- error_report_err(local_err);
20
+ if (rc != 1) {
21
+ if (local_err) {
22
+ error_report_err(local_err);
23
+ }
24
goto fail;
25
}
26
}
27
--
28
2.26.2
29
1
From: Jagannathan Raman <jag.raman@oracle.com>
1
The vu_client_trip() coroutine is leaked during AioContext switching. It
2
is also unsafe to destroy the vu_dev in panic_cb() since its callers
3
still access it in some cases.
2
4
3
Define and register callbacks to manage the RAM regions used for
5
Rework the lifecycle to solve these safety issues.
4
device DMA
5
6
6
Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
7
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
7
Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
8
Message-id: 20200924151549.913737-10-stefanha@redhat.com
8
Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
9
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
10
Message-id: faacbcd45c4d02c591f0dbfdc19041fbb3eae7eb.1655151679.git.jag.raman@oracle.com
11
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
9
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
12
---
10
---
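Note on the vu_client_trip() lifecycle rework: the reworked control flow (run the dispatch loop until the device is broken or dispatch fails, then clean up in one place and let the listener accept the next client) can be modelled with plain functions. This is a simplified, synchronous sketch and not the coroutine code in util/vhost-user-server.c. Server, dispatch() and client_trip() are hypothetical, and the boolean fields stand in for state that the real code keeps in VuDev, the QIOChannel and the restart bottom half.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the per-connection state. */
typedef struct Server {
    bool broken;          /* set when the panic path has run */
    int msgs_left;        /* messages available before the peer disconnects */
    bool connected;       /* stands in for the sioc/ioc references */
    bool trip_running;    /* stands in for co_trip != NULL */
    bool stopping;        /* stands in for the restart BH having been deleted */
    bool restart_pending; /* stands in for scheduling the restart BH */
} Server;

/* Handle one message; false means the connection can no longer be read. */
static bool dispatch(Server *s)
{
    if (s->msgs_left == 0) {
        return false;
    }
    s->msgs_left--;
    return true;
}

/* All teardown happens here, after the loop, never inside the panic path. */
static void client_trip(Server *s)
{
    s->trip_running = true;

    while (!s->broken && dispatch(s)) {
        /* Keep running */
    }

    s->connected = false;
    s->trip_running = false;
    if (!s->stopping) {
        s->restart_pending = true; /* allow the next client to connect */
    }
}
```

In this model the panic path only has to set the broken flag. The loop observes it and the single cleanup block runs once, so no caller is left holding a device that was destroyed underneath it.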
13
hw/remote/machine.c | 5 ++++
11
util/vhost-user-server.h | 29 ++--
14
hw/remote/vfio-user-obj.c | 55 +++++++++++++++++++++++++++++++++++++++
12
block/export/vhost-user-blk-server.c | 9 +-
15
hw/remote/trace-events | 2 ++
13
util/vhost-user-server.c | 245 +++++++++++++++------------
16
3 files changed, 62 insertions(+)
14
3 files changed, 155 insertions(+), 128 deletions(-)
17
15
18
diff --git a/hw/remote/machine.c b/hw/remote/machine.c
16
diff --git a/util/vhost-user-server.h b/util/vhost-user-server.h
19
index XXXXXXX..XXXXXXX 100644
17
index XXXXXXX..XXXXXXX 100644
20
--- a/hw/remote/machine.c
18
--- a/util/vhost-user-server.h
21
+++ b/hw/remote/machine.c
19
+++ b/util/vhost-user-server.h
22
@@ -XXX,XX +XXX,XX @@
20
@@ -XXX,XX +XXX,XX @@
23
#include "hw/remote/iohub.h"
21
#include "qapi/error.h"
24
#include "hw/remote/iommu.h"
22
#include "standard-headers/linux/virtio_blk.h"
25
#include "hw/qdev-core.h"
23
26
+#include "hw/remote/iommu.h"
24
+/* A kick fd that we monitor on behalf of libvhost-user */
27
25
typedef struct VuFdWatch {
28
static void remote_machine_init(MachineState *machine)
26
VuDev *vu_dev;
29
{
27
int fd; /*kick fd*/
30
@@ -XXX,XX +XXX,XX @@ static void remote_machine_init(MachineState *machine)
28
void *pvt;
31
29
vu_watch_cb cb;
32
pci_host = PCI_HOST_BRIDGE(rem_host);
30
- bool processing;
33
31
QTAILQ_ENTRY(VuFdWatch) next;
34
+ if (s->vfio_user) {
32
} VuFdWatch;
35
+ remote_iommu_setup(pci_host->bus);
33
34
-typedef struct VuServer VuServer;
35
-
36
-struct VuServer {
37
+/**
38
+ * VuServer:
39
+ * A vhost-user server instance with user-defined VuDevIface callbacks.
40
+ * Vhost-user device backends can be implemented using VuServer. VuDevIface
41
+ * callbacks and virtqueue kicks run in the given AioContext.
42
+ */
43
+typedef struct {
44
QIONetListener *listener;
45
+ QEMUBH *restart_listener_bh;
46
AioContext *ctx;
47
int max_queues;
48
const VuDevIface *vu_iface;
49
+
50
+ /* Protected by ctx lock */
51
VuDev vu_dev;
52
QIOChannel *ioc; /* The I/O channel with the client */
53
QIOChannelSocket *sioc; /* The underlying data channel with the client */
54
- /* IOChannel for fd provided via VHOST_USER_SET_SLAVE_REQ_FD */
55
- QIOChannel *ioc_slave;
56
- QIOChannelSocket *sioc_slave;
57
- Coroutine *co_trip; /* coroutine for processing VhostUserMsg */
58
QTAILQ_HEAD(, VuFdWatch) vu_fd_watches;
59
- /* restart coroutine co_trip if AIOContext is changed */
60
- bool aio_context_changed;
61
- bool processing_msg;
62
-};
63
+
64
+ Coroutine *co_trip; /* coroutine for processing VhostUserMsg */
65
+} VuServer;
66
67
bool vhost_user_server_start(VuServer *server,
68
SocketAddress *unix_socket,
69
@@ -XXX,XX +XXX,XX @@ bool vhost_user_server_start(VuServer *server,
70
71
void vhost_user_server_stop(VuServer *server);
72
73
-void vhost_user_server_set_aio_context(VuServer *server, AioContext *ctx);
74
+void vhost_user_server_attach_aio_context(VuServer *server, AioContext *ctx);
75
+void vhost_user_server_detach_aio_context(VuServer *server);
76
77
#endif /* VHOST_USER_SERVER_H */
78
diff --git a/block/export/vhost-user-blk-server.c b/block/export/vhost-user-blk-server.c
79
index XXXXXXX..XXXXXXX 100644
80
--- a/block/export/vhost-user-blk-server.c
81
+++ b/block/export/vhost-user-blk-server.c
82
@@ -XXX,XX +XXX,XX @@ static const VuDevIface vu_block_iface = {
83
static void blk_aio_attached(AioContext *ctx, void *opaque)
84
{
85
VuBlockDev *vub_dev = opaque;
86
- aio_context_acquire(ctx);
87
- vhost_user_server_set_aio_context(&vub_dev->vu_server, ctx);
88
- aio_context_release(ctx);
89
+ vhost_user_server_attach_aio_context(&vub_dev->vu_server, ctx);
90
}
91
92
static void blk_aio_detach(void *opaque)
93
{
94
VuBlockDev *vub_dev = opaque;
95
- AioContext *ctx = vub_dev->vu_server.ctx;
96
- aio_context_acquire(ctx);
97
- vhost_user_server_set_aio_context(&vub_dev->vu_server, NULL);
98
- aio_context_release(ctx);
99
+ vhost_user_server_detach_aio_context(&vub_dev->vu_server);
100
}
101
102
static void
103
diff --git a/util/vhost-user-server.c b/util/vhost-user-server.c
104
index XXXXXXX..XXXXXXX 100644
105
--- a/util/vhost-user-server.c
106
+++ b/util/vhost-user-server.c
107
@@ -XXX,XX +XXX,XX @@
108
*/
109
#include "qemu/osdep.h"
110
#include "qemu/main-loop.h"
111
+#include "block/aio-wait.h"
112
#include "vhost-user-server.h"
113
114
+/*
115
+ * Theory of operation:
116
+ *
117
+ * VuServer is started and stopped by vhost_user_server_start() and
118
+ * vhost_user_server_stop() from the main loop thread. Starting the server
119
+ * opens a vhost-user UNIX domain socket and listens for incoming connections.
120
+ * Only one connection is allowed at a time.
121
+ *
122
+ * The connection is handled by the vu_client_trip() coroutine in the
123
+ * VuServer->ctx AioContext. The coroutine consists of a vu_dispatch() loop
124
+ * where libvhost-user calls vu_message_read() to receive the next vhost-user
125
+ * protocol messages over the UNIX domain socket.
126
+ *
127
+ * When virtqueues are set up libvhost-user calls set_watch() to monitor kick
128
+ * fds. These fds are also handled in the VuServer->ctx AioContext.
129
+ *
130
+ * Both vu_client_trip() and kick fd monitoring can be stopped by shutting down
131
+ * the socket connection. Shutting down the socket connection causes
132
+ * vu_message_read() to fail since no more data can be received from the socket.
133
+ * After vu_dispatch() fails, vu_client_trip() calls vu_deinit() to stop
134
+ * libvhost-user before terminating the coroutine. vu_deinit() calls
135
+ * remove_watch() to stop monitoring kick fds and this stops virtqueue
136
+ * processing.
137
+ *
138
+ * When vu_client_trip() has finished cleaning up it schedules a BH in the main
139
+ * loop thread to accept the next client connection.
140
+ *
141
+ * When libvhost-user detects an error it calls panic_cb() and sets the
142
+ * dev->broken flag. Both vu_client_trip() and kick fd processing stop when
143
+ * the dev->broken flag is set.
144
+ *
145
+ * It is possible to switch AioContexts using
146
+ * vhost_user_server_detach_aio_context() and
147
+ * vhost_user_server_attach_aio_context(). They stop monitoring fds in the old
148
+ * AioContext and resume monitoring in the new AioContext. The vu_client_trip()
149
+ * coroutine remains in a yielded state during the switch. This is made
150
+ * possible by QIOChannel's support for spurious coroutine re-entry in
151
+ * qio_channel_yield(). The coroutine will restart I/O when re-entered from the
152
+ * new AioContext.
153
+ */
154
+
155
static void vmsg_close_fds(VhostUserMsg *vmsg)
156
{
157
int i;
158
@@ -XXX,XX +XXX,XX @@ static void vmsg_unblock_fds(VhostUserMsg *vmsg)
159
}
160
}
161
162
-static void vu_accept(QIONetListener *listener, QIOChannelSocket *sioc,
163
- gpointer opaque);
164
-
165
-static void close_client(VuServer *server)
166
-{
167
- /*
168
- * Before closing the client
169
- *
170
- * 1. Let vu_client_trip stop processing new vhost-user msg
171
- *
172
- * 2. remove kick_handler
173
- *
174
- * 3. wait for the kick handler to be finished
175
- *
176
- * 4. wait for the current vhost-user msg to be finished processing
177
- */
178
-
179
- QIOChannelSocket *sioc = server->sioc;
180
- /* When this is set vu_client_trip will stop new processing vhost-user message */
181
- server->sioc = NULL;
182
-
183
- while (server->processing_msg) {
184
- if (server->ioc->read_coroutine) {
185
- server->ioc->read_coroutine = NULL;
186
- qio_channel_set_aio_fd_handler(server->ioc, server->ioc->ctx, NULL,
187
- NULL, server->ioc);
188
- server->processing_msg = false;
189
- }
190
- }
191
-
192
- vu_deinit(&server->vu_dev);
193
-
194
- /* vu_deinit() should have called remove_watch() */
195
- assert(QTAILQ_EMPTY(&server->vu_fd_watches));
196
-
197
- object_unref(OBJECT(sioc));
198
- object_unref(OBJECT(server->ioc));
199
-}
200
-
201
static void panic_cb(VuDev *vu_dev, const char *buf)
202
{
203
- VuServer *server = container_of(vu_dev, VuServer, vu_dev);
204
-
205
- /* avoid while loop in close_client */
206
- server->processing_msg = false;
207
-
208
- if (buf) {
209
- error_report("vu_panic: %s", buf);
210
- }
211
-
212
- if (server->sioc) {
213
- close_client(server);
214
- }
215
-
216
- /*
217
- * Set the callback function for network listener so another
218
- * vhost-user client can connect to this server
219
- */
220
- qio_net_listener_set_client_func(server->listener,
221
- vu_accept,
222
- server,
223
- NULL);
224
+ error_report("vu_panic: %s", buf);
225
}
226
227
static bool coroutine_fn
228
@@ -XXX,XX +XXX,XX @@ fail:
229
return false;
230
}
231
232
-
233
-static void vu_client_start(VuServer *server);
234
static coroutine_fn void vu_client_trip(void *opaque)
235
{
236
VuServer *server = opaque;
237
+ VuDev *vu_dev = &server->vu_dev;
238
239
- while (!server->aio_context_changed && server->sioc) {
240
- server->processing_msg = true;
241
- vu_dispatch(&server->vu_dev);
242
- server->processing_msg = false;
243
+ while (!vu_dev->broken && vu_dispatch(vu_dev)) {
244
+ /* Keep running */
245
}
246
247
- if (server->aio_context_changed && server->sioc) {
248
- server->aio_context_changed = false;
249
- vu_client_start(server);
250
- }
251
-}
252
+ vu_deinit(vu_dev);
253
+
254
+ /* vu_deinit() should have called remove_watch() */
255
+ assert(QTAILQ_EMPTY(&server->vu_fd_watches));
256
+
257
+ object_unref(OBJECT(server->sioc));
258
+ server->sioc = NULL;
259
260
-static void vu_client_start(VuServer *server)
261
-{
262
- server->co_trip = qemu_coroutine_create(vu_client_trip, server);
263
- aio_co_enter(server->ctx, server->co_trip);
264
+ object_unref(OBJECT(server->ioc));
265
+ server->ioc = NULL;
266
+
267
+ server->co_trip = NULL;
268
+ if (server->restart_listener_bh) {
269
+ qemu_bh_schedule(server->restart_listener_bh);
36
+ }
270
+ }
37
+
271
+ aio_wait_kick();
38
remote_iohub_init(&s->iohub);
272
}
39
273
40
pci_bus_irqs(pci_host->bus, remote_iohub_set_irq, remote_iohub_map_irq,
274
/*
41
diff --git a/hw/remote/vfio-user-obj.c b/hw/remote/vfio-user-obj.c
275
@@ -XXX,XX +XXX,XX @@ static void vu_client_start(VuServer *server)
42
index XXXXXXX..XXXXXXX 100644
276
static void kick_handler(void *opaque)
43
--- a/hw/remote/vfio-user-obj.c
277
{
44
+++ b/hw/remote/vfio-user-obj.c
278
VuFdWatch *vu_fd_watch = opaque;
45
@@ -XXX,XX +XXX,XX @@ static ssize_t vfu_object_cfg_access(vfu_ctx_t *vfu_ctx, char * const buf,
279
- vu_fd_watch->processing = true;
46
return count;
280
- vu_fd_watch->cb(vu_fd_watch->vu_dev, 0, vu_fd_watch->pvt);
47
}
281
- vu_fd_watch->processing = false;
48
282
+ VuDev *vu_dev = vu_fd_watch->vu_dev;
49
+static void dma_register(vfu_ctx_t *vfu_ctx, vfu_dma_info_t *info)
283
+
284
+ vu_fd_watch->cb(vu_dev, 0, vu_fd_watch->pvt);
285
+
286
+ /* Stop vu_client_trip() if an error occurred in vu_fd_watch->cb() */
287
+ if (vu_dev->broken) {
288
+ VuServer *server = container_of(vu_dev, VuServer, vu_dev);
289
+
290
+ qio_channel_shutdown(server->ioc, QIO_CHANNEL_SHUTDOWN_BOTH, NULL);
291
+ }
292
}
293
294
-
295
static VuFdWatch *find_vu_fd_watch(VuServer *server, int fd)
296
{
297
298
@@ -XXX,XX +XXX,XX @@ static void vu_accept(QIONetListener *listener, QIOChannelSocket *sioc,
299
qio_channel_set_name(QIO_CHANNEL(sioc), "vhost-user client");
300
server->ioc = QIO_CHANNEL(sioc);
301
object_ref(OBJECT(server->ioc));
302
- qio_channel_attach_aio_context(server->ioc, server->ctx);
303
+
304
+ /* TODO vu_message_write() spins if non-blocking! */
305
qio_channel_set_blocking(server->ioc, false, NULL);
306
- vu_client_start(server);
307
+
308
+ server->co_trip = qemu_coroutine_create(vu_client_trip, server);
309
+
310
+ aio_context_acquire(server->ctx);
311
+ vhost_user_server_attach_aio_context(server, server->ctx);
312
+ aio_context_release(server->ctx);
313
}
314
315
-
316
void vhost_user_server_stop(VuServer *server)
317
{
318
+ aio_context_acquire(server->ctx);
319
+
320
+ qemu_bh_delete(server->restart_listener_bh);
321
+ server->restart_listener_bh = NULL;
322
+
323
if (server->sioc) {
324
- close_client(server);
325
+ VuFdWatch *vu_fd_watch;
326
+
327
+ QTAILQ_FOREACH(vu_fd_watch, &server->vu_fd_watches, next) {
328
+ aio_set_fd_handler(server->ctx, vu_fd_watch->fd, true,
329
+ NULL, NULL, NULL, vu_fd_watch);
330
+ }
331
+
332
+ qio_channel_shutdown(server->ioc, QIO_CHANNEL_SHUTDOWN_BOTH, NULL);
333
+
334
+ AIO_WAIT_WHILE(server->ctx, server->co_trip);
335
}
336
337
+ aio_context_release(server->ctx);
338
+
339
if (server->listener) {
340
qio_net_listener_disconnect(server->listener);
341
object_unref(OBJECT(server->listener));
342
}
343
+}
344
+
345
+/*
346
+ * Allow the next client to connect to the server. Called from a BH in the main
347
+ * loop.
348
+ */
349
+static void restart_listener_bh(void *opaque)
50
+{
350
+{
51
+ VfuObject *o = vfu_get_private(vfu_ctx);
351
+ VuServer *server = opaque;
52
+ AddressSpace *dma_as = NULL;
352
53
+ MemoryRegion *subregion = NULL;
353
+ qio_net_listener_set_client_func(server->listener, vu_accept, server,
54
+ g_autofree char *name = NULL;
354
+ NULL);
55
+ struct iovec *iov = &info->iova;
355
}
56
+
356
57
+ if (!info->vaddr) {
357
-void vhost_user_server_set_aio_context(VuServer *server, AioContext *ctx)
58
+ return;
358
+/* Called with ctx acquired */
359
+void vhost_user_server_attach_aio_context(VuServer *server, AioContext *ctx)
360
{
361
- VuFdWatch *vu_fd_watch, *next;
362
- void *opaque = NULL;
363
- IOHandler *io_read = NULL;
364
- bool attach;
365
+ VuFdWatch *vu_fd_watch;
366
367
- server->ctx = ctx ? ctx : qemu_get_aio_context();
368
+ server->ctx = ctx;
369
370
if (!server->sioc) {
371
- /* not yet serving any client*/
372
return;
373
}
374
375
- if (ctx) {
376
- qio_channel_attach_aio_context(server->ioc, ctx);
377
- server->aio_context_changed = true;
378
- io_read = kick_handler;
379
- attach = true;
380
- } else {
381
+ qio_channel_attach_aio_context(server->ioc, ctx);
382
+
383
+ QTAILQ_FOREACH(vu_fd_watch, &server->vu_fd_watches, next) {
384
+ aio_set_fd_handler(ctx, vu_fd_watch->fd, true, kick_handler, NULL,
385
+ NULL, vu_fd_watch);
59
+ }
386
+ }
60
+
387
+
61
+ name = g_strdup_printf("mem-%s-%"PRIx64"", o->device,
388
+ aio_co_schedule(ctx, server->co_trip);
62
+ (uint64_t)info->vaddr);
63
+
64
+ subregion = g_new0(MemoryRegion, 1);
65
+
66
+ memory_region_init_ram_ptr(subregion, NULL, name,
67
+ iov->iov_len, info->vaddr);
68
+
69
+ dma_as = pci_device_iommu_address_space(o->pci_dev);
70
+
71
+ memory_region_add_subregion(dma_as->root, (hwaddr)iov->iov_base, subregion);
72
+
73
+ trace_vfu_dma_register((uint64_t)iov->iov_base, iov->iov_len);
74
+}
389
+}
75
+
390
+
76
+static void dma_unregister(vfu_ctx_t *vfu_ctx, vfu_dma_info_t *info)
391
+/* Called with server->ctx acquired */
392
+void vhost_user_server_detach_aio_context(VuServer *server)
77
+{
393
+{
78
+ VfuObject *o = vfu_get_private(vfu_ctx);
394
+ if (server->sioc) {
79
+ AddressSpace *dma_as = NULL;
395
+ VuFdWatch *vu_fd_watch;
80
+ MemoryRegion *mr = NULL;
396
+
81
+ ram_addr_t offset;
397
+ QTAILQ_FOREACH(vu_fd_watch, &server->vu_fd_watches, next) {
82
+
398
+ aio_set_fd_handler(server->ctx, vu_fd_watch->fd, true,
83
+ mr = memory_region_from_host(info->vaddr, &offset);
399
+ NULL, NULL, NULL, vu_fd_watch);
84
+ if (!mr) {
400
+ }
85
+ return;
401
+
86
+ }
402
qio_channel_detach_aio_context(server->ioc);
87
+
403
- /* server->ioc->ctx keeps the old AioConext */
88
+ dma_as = pci_device_iommu_address_space(o->pci_dev);
404
- ctx = server->ioc->ctx;
89
+
405
- attach = false;
90
+ memory_region_del_subregion(dma_as->root, mr);
406
}
91
+
407
92
+ object_unparent((OBJECT(mr)));
408
- QTAILQ_FOREACH_SAFE(vu_fd_watch, &server->vu_fd_watches, next, next) {
93
+
409
- if (vu_fd_watch->cb) {
94
+ trace_vfu_dma_unregister((uint64_t)info->iova.iov_base);
410
- opaque = attach ? vu_fd_watch : NULL;
95
+}
411
- aio_set_fd_handler(ctx, vu_fd_watch->fd, true,
96
+
412
- io_read, NULL, NULL,
97
/*
413
- opaque);
98
* TYPE_VFU_OBJECT depends on the availability of the 'socket' and 'device'
414
- }
99
* properties. It also depends on devices instantiated in QEMU. These
415
- }
100
@@ -XXX,XX +XXX,XX @@ static void vfu_object_init_ctx(VfuObject *o, Error **errp)
416
+ server->ctx = NULL;
101
goto fail;
417
}
102
}
418
103
419
-
104
+ ret = vfu_setup_device_dma(o->vfu_ctx, &dma_register, &dma_unregister);
420
bool vhost_user_server_start(VuServer *server,
105
+ if (ret < 0) {
421
SocketAddress *socket_addr,
106
+ error_setg(errp, "vfu: Failed to setup DMA handlers for %s",
422
AioContext *ctx,
107
+ o->device);
423
@@ -XXX,XX +XXX,XX @@ bool vhost_user_server_start(VuServer *server,
108
+ goto fail;
424
const VuDevIface *vu_iface,
109
+ }
425
Error **errp)
110
+
426
{
111
ret = vfu_realize_ctx(o->vfu_ctx);
427
+ QEMUBH *bh;
112
if (ret < 0) {
428
QIONetListener *listener = qio_net_listener_new();
113
error_setg(errp, "vfu: Failed to realize device %s- %s",
429
if (qio_net_listener_open_sync(listener, socket_addr, 1,
114
diff --git a/hw/remote/trace-events b/hw/remote/trace-events
430
errp) < 0) {
115
index XXXXXXX..XXXXXXX 100644
431
@@ -XXX,XX +XXX,XX @@ bool vhost_user_server_start(VuServer *server,
116
--- a/hw/remote/trace-events
432
return false;
117
+++ b/hw/remote/trace-events
433
}
118
@@ -XXX,XX +XXX,XX @@ mpqemu_recv_io_error(int cmd, int size, int nfds) "failed to receive %d size %d,
434
119
vfu_prop(const char *prop, const char *val) "vfu: setting %s as %s"
435
+ bh = qemu_bh_new(restart_listener_bh, server);
120
vfu_cfg_read(uint32_t offset, uint32_t val) "vfu: cfg: 0x%u -> 0x%x"
436
+
121
vfu_cfg_write(uint32_t offset, uint32_t val) "vfu: cfg: 0x%u <- 0x%x"
437
/* zero out unspecified fields */
122
+vfu_dma_register(uint64_t gpa, size_t len) "vfu: registering GPA 0x%"PRIx64", %zu bytes"
438
*server = (VuServer) {
123
+vfu_dma_unregister(uint64_t gpa) "vfu: unregistering GPA 0x%"PRIx64""
439
.listener = listener,
440
+ .restart_listener_bh = bh,
441
.vu_iface = vu_iface,
442
.max_queues = max_queues,
443
.ctx = ctx,
124
--
444
--
125
2.36.1
445
2.26.2
446
Propagate the flush return value since errors are possible.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20200924151549.913737-11-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 block/export/vhost-user-blk-server.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/block/export/vhost-user-blk-server.c b/block/export/vhost-user-blk-server.c
index XXXXXXX..XXXXXXX 100644
--- a/block/export/vhost-user-blk-server.c
+++ b/block/export/vhost-user-blk-server.c
@@ -XXX,XX +XXX,XX @@ vu_block_discard_write_zeroes(VuBlockReq *req, struct iovec *iov,
     return -EINVAL;
 }

-static void coroutine_fn vu_block_flush(VuBlockReq *req)
+static int coroutine_fn vu_block_flush(VuBlockReq *req)
 {
     VuBlockDev *vdev_blk = get_vu_block_device_by_server(req->server);
     BlockBackend *backend = vdev_blk->backend;
-    blk_co_flush(backend);
+    return blk_co_flush(backend);
 }

 static void coroutine_fn vu_block_virtio_process_req(void *opaque)
@@ -XXX,XX +XXX,XX @@ static void coroutine_fn vu_block_virtio_process_req(void *opaque)
         break;
     }
     case VIRTIO_BLK_T_FLUSH:
-        vu_block_flush(req);
-        req->in->status = VIRTIO_BLK_S_OK;
+        if (vu_block_flush(req) == 0) {
+            req->in->status = VIRTIO_BLK_S_OK;
+        } else {
+            req->in->status = VIRTIO_BLK_S_IOERR;
+        }
         break;
     case VIRTIO_BLK_T_GET_ID: {
         size_t size = MIN(iov_size(&elem->in_sg[0], in_num),
--
2.26.2
1
From: Jagannathan Raman <jag.raman@oracle.com>
1
Use the new QAPI block exports API instead of defining our own QOM
2
2
objects.
3
Add blocker to prevent hot-unplug of devices
3
4
4
This is a large change because the lifecycle of VuBlockDev needs to
5
TYPE_VFIO_USER_SERVER, which is introduced shortly, attaches itself to a
5
follow BlockExportDriver. QOM properties are replaced by QAPI options
6
PCIDevice on which it depends. If the attached PCIDevice gets removed
6
objects.
7
while the server is in use, it could cause it to crash. To prevent this,
7
8
TYPE_VFIO_USER_SERVER adds an unplug blocker for the PCIDevice.
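
The unplug-blocker mechanism described in this commit message can be sketched on its own. This is a simplified illustration under stated assumptions, not the QEMU implementation: `Device`, `Blocker`, and the `device_*` functions are hypothetical stand-ins, whereas the patch itself works on DeviceState, uses Error objects as reasons, and keeps them in a GSList. As in the patch's API documentation, the reason pointer doubles as the handle used to look up the blocker for deletion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for one list entry; the reason is also the handle. */
typedef struct Blocker {
    const char *reason;
    struct Blocker *next;
} Blocker;

/* Hypothetical stand-in for a device that carries a list of blockers. */
typedef struct Device {
    Blocker *unplug_blockers;
} Device;

/* Register a reason why the device must not be unplugged. */
static void device_add_unplug_blocker(Device *dev, const char *reason)
{
    Blocker *b = malloc(sizeof(*b));
    assert(b != NULL);
    b->reason = reason;
    b->next = dev->unplug_blockers;
    dev->unplug_blockers = b;
}

/* Remove the blocker that was registered with this exact reason pointer. */
static void device_del_unplug_blocker(Device *dev, const char *reason)
{
    Blocker **p = &dev->unplug_blockers;

    while (*p) {
        if ((*p)->reason == reason) {   /* pointer identity, not strcmp() */
            Blocker *dead = *p;
            *p = dead->next;
            free(dead);
            return;
        }
        p = &(*p)->next;
    }
}

/* Return true if any blocker is present; optionally report one reason. */
static bool device_unplug_blocked(const Device *dev, const char **reason)
{
    if (dev->unplug_blockers) {
        if (reason) {
            *reason = dev->unplug_blockers->reason;
        }
        return true;
    }
    return false;
}
```

An unplug path would call the `device_unplug_blocked()` check first and refuse the request while any blocker remains registered.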
8
VuBlockDev is renamed VuBlkExport and contains a BlockExport field.
9
9
Several fields can be dropped since BlockExport already has equivalents.
10
Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
10
11
Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
11
The file names and meson build integration will be adjusted in a future
12
Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
12
patch. libvhost-user should probably be built as a static library that
13
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
13
is linked into QEMU instead of as a .c file that results in duplicate
14
Message-id: c41ef80b7cc063314d629737bed2159e5713f2e0.1655151679.git.jag.raman@oracle.com
14
compilation.
15
16
The new command-line syntax is:
17
18
$ qemu-storage-daemon \
19
--blockdev file,node-name=drive0,filename=test.img \
20
--export vhost-user-blk,node-name=drive0,id=export0,unix-socket=/tmp/vhost-user-blk.sock
21
22
Note that unix-socket is optional because we may wish to accept chardevs
23
too in the future.
24
25
Markus noted that supported address families are not explicit in the
26
QAPI schema. It is unlikely that support for more address families will
27
be added since file descriptor passing is required and few address
28
families support it. If a new address family needs to be added, then the
29
QAPI 'features' syntax can be used to advertise them.
30
31
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
32
Acked-by: Markus Armbruster <armbru@redhat.com>
33
Message-id: 20200924151549.913737-12-stefanha@redhat.com
34
[Skip test on big-endian host architectures because this device doesn't
35
support them yet (as already mentioned in a code comment).
36
--Stefan]
15
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
37
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
16
---
38
---
17
include/hw/qdev-core.h | 29 +++++++++++++++++++++++++++++
39
qapi/block-export.json | 21 +-
18
hw/core/qdev.c | 24 ++++++++++++++++++++++++
40
block/export/vhost-user-blk-server.h | 23 +-
19
softmmu/qdev-monitor.c | 4 ++++
41
block/export/export.c | 6 +
20
3 files changed, 57 insertions(+)
42
block/export/vhost-user-blk-server.c | 452 +++++++--------------------
21
43
util/vhost-user-server.c | 10 +-
22
diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h
44
block/export/meson.build | 1 +
45
block/meson.build | 1 -
46
7 files changed, 156 insertions(+), 358 deletions(-)
47
48
diff --git a/qapi/block-export.json b/qapi/block-export.json
23
index XXXXXXX..XXXXXXX 100644
49
index XXXXXXX..XXXXXXX 100644
24
--- a/include/hw/qdev-core.h
50
--- a/qapi/block-export.json
25
+++ b/include/hw/qdev-core.h
51
+++ b/qapi/block-export.json
26
@@ -XXX,XX +XXX,XX @@ struct DeviceState {
52
@@ -XXX,XX +XXX,XX @@
27
int instance_id_alias;
53
'data': { '*name': 'str', '*description': 'str',
28
int alias_required_for_version;
54
'*bitmap': 'str' } }
29
ResettableState reset;
55
30
+ GSList *unplug_blockers;
56
+##
57
+# @BlockExportOptionsVhostUserBlk:
58
+#
59
+# A vhost-user-blk block export.
60
+#
61
+# @addr: The vhost-user socket on which to listen. Both 'unix' and 'fd'
62
+# SocketAddress types are supported. Passed fds must be UNIX domain
63
+# sockets.
64
+# @logical-block-size: Logical block size in bytes. Defaults to 512 bytes.
65
+#
66
+# Since: 5.2
67
+##
68
+{ 'struct': 'BlockExportOptionsVhostUserBlk',
69
+ 'data': { 'addr': 'SocketAddress', '*logical-block-size': 'size' } }
70
+
71
##
72
# @NbdServerAddOptions:
73
#
74
@@ -XXX,XX +XXX,XX @@
75
# An enumeration of block export types
76
#
77
# @nbd: NBD export
78
+# @vhost-user-blk: vhost-user-blk export (since 5.2)
79
#
80
# Since: 4.2
81
##
82
{ 'enum': 'BlockExportType',
83
- 'data': [ 'nbd' ] }
84
+ 'data': [ 'nbd', 'vhost-user-blk' ] }
85
86
##
87
# @BlockExportOptions:
88
@@ -XXX,XX +XXX,XX @@
89
'*writethrough': 'bool' },
90
'discriminator': 'type',
91
'data': {
92
- 'nbd': 'BlockExportOptionsNbd'
93
+ 'nbd': 'BlockExportOptionsNbd',
94
+ 'vhost-user-blk': 'BlockExportOptionsVhostUserBlk'
95
} }
96
97
##
98
diff --git a/block/export/vhost-user-blk-server.h b/block/export/vhost-user-blk-server.h
99
index XXXXXXX..XXXXXXX 100644
100
--- a/block/export/vhost-user-blk-server.h
101
+++ b/block/export/vhost-user-blk-server.h
102
@@ -XXX,XX +XXX,XX @@
103
104
#ifndef VHOST_USER_BLK_SERVER_H
105
#define VHOST_USER_BLK_SERVER_H
106
-#include "util/vhost-user-server.h"
107
108
-typedef struct VuBlockDev VuBlockDev;
109
-#define TYPE_VHOST_USER_BLK_SERVER "vhost-user-blk-server"
110
-#define VHOST_USER_BLK_SERVER(obj) \
111
- OBJECT_CHECK(VuBlockDev, obj, TYPE_VHOST_USER_BLK_SERVER)
112
+#include "block/export.h"
113
114
-/* vhost user block device */
115
-struct VuBlockDev {
116
- Object parent_obj;
117
- char *node_name;
118
- SocketAddress *addr;
119
- AioContext *ctx;
120
- VuServer vu_server;
121
- bool running;
122
- uint32_t blk_size;
123
- BlockBackend *backend;
124
- QIOChannelSocket *sioc;
125
- QTAILQ_ENTRY(VuBlockDev) next;
126
- struct virtio_blk_config blkcfg;
127
- bool writable;
128
-};
129
+/* For block/export/export.c */
130
+extern const BlockExportDriver blk_exp_vhost_user_blk;
131
132
#endif /* VHOST_USER_BLK_SERVER_H */
133
diff --git a/block/export/export.c b/block/export/export.c
134
index XXXXXXX..XXXXXXX 100644
135
--- a/block/export/export.c
136
+++ b/block/export/export.c
137
@@ -XXX,XX +XXX,XX @@
138
#include "sysemu/block-backend.h"
139
#include "block/export.h"
140
#include "block/nbd.h"
141
+#if CONFIG_LINUX
142
+#include "block/export/vhost-user-blk-server.h"
143
+#endif
144
#include "qapi/error.h"
145
#include "qapi/qapi-commands-block-export.h"
146
#include "qapi/qapi-events-block-export.h"
147
@@ -XXX,XX +XXX,XX @@
148
149
static const BlockExportDriver *blk_exp_drivers[] = {
150
&blk_exp_nbd,
151
+#if CONFIG_LINUX
152
+ &blk_exp_vhost_user_blk,
153
+#endif
31
};
154
};
32
155
33
struct DeviceListener {
156
/* Only accessed from the main thread */
34
@@ -XXX,XX +XXX,XX @@ void qdev_simple_device_unplug_cb(HotplugHandler *hotplug_dev,
157
diff --git a/block/export/vhost-user-blk-server.c b/block/export/vhost-user-blk-server.c
35
void qdev_machine_creation_done(void);
158
index XXXXXXX..XXXXXXX 100644
36
bool qdev_machine_modified(void);
159
--- a/block/export/vhost-user-blk-server.c
37
160
+++ b/block/export/vhost-user-blk-server.c
38
+/**
161
@@ -XXX,XX +XXX,XX @@
39
+ * qdev_add_unplug_blocker: Add an unplug blocker to a device
162
*/
40
+ *
163
#include "qemu/osdep.h"
41
+ * @dev: Device to be blocked from unplug
164
#include "block/block.h"
42
+ * @reason: Reason for blocking
165
+#include "contrib/libvhost-user/libvhost-user.h"
43
+ */
166
+#include "standard-headers/linux/virtio_blk.h"
44
+void qdev_add_unplug_blocker(DeviceState *dev, Error *reason);
167
+#include "util/vhost-user-server.h"
168
#include "vhost-user-blk-server.h"
169
#include "qapi/error.h"
170
#include "qom/object_interfaces.h"
171
@@ -XXX,XX +XXX,XX @@ struct virtio_blk_inhdr {
172
unsigned char status;
173
};
174
175
-typedef struct VuBlockReq {
176
+typedef struct VuBlkReq {
177
VuVirtqElement elem;
178
int64_t sector_num;
179
size_t size;
180
@@ -XXX,XX +XXX,XX @@ typedef struct VuBlockReq {
181
struct virtio_blk_outhdr out;
182
VuServer *server;
183
struct VuVirtq *vq;
184
-} VuBlockReq;
185
+} VuBlkReq;
186
187
-static void vu_block_req_complete(VuBlockReq *req)
188
+/* vhost user block device */
189
+typedef struct {
190
+ BlockExport export;
191
+ VuServer vu_server;
192
+ uint32_t blk_size;
193
+ QIOChannelSocket *sioc;
194
+ struct virtio_blk_config blkcfg;
195
+ bool writable;
196
+} VuBlkExport;
45
+
197
+
46
+/**
198
+static void vu_blk_req_complete(VuBlkReq *req)
47
+ * qdev_del_unplug_blocker: Remove an unplug blocker from a device
199
{
48
+ *
200
VuDev *vu_dev = &req->server->vu_dev;
49
+ * @dev: Device to be unblocked
201
50
+ * @reason: Pointer to the Error used with qdev_add_unplug_blocker.
202
@@ -XXX,XX +XXX,XX @@ static void vu_block_req_complete(VuBlockReq *req)
51
+ * Used as a handle to lookup the blocker for deletion.
203
free(req);
52
+ */
204
}
53
+void qdev_del_unplug_blocker(DeviceState *dev, Error *reason);
205
206
-static VuBlockDev *get_vu_block_device_by_server(VuServer *server)
207
-{
208
- return container_of(server, VuBlockDev, vu_server);
209
-}
210
-
211
static int coroutine_fn
212
-vu_block_discard_write_zeroes(VuBlockReq *req, struct iovec *iov,
213
- uint32_t iovcnt, uint32_t type)
214
+vu_blk_discard_write_zeroes(BlockBackend *blk, struct iovec *iov,
215
+ uint32_t iovcnt, uint32_t type)
216
{
217
struct virtio_blk_discard_write_zeroes desc;
218
ssize_t size = iov_to_buf(iov, iovcnt, 0, &desc, sizeof(desc));
219
@@ -XXX,XX +XXX,XX @@ vu_block_discard_write_zeroes(VuBlockReq *req, struct iovec *iov,
220
return -EINVAL;
221
}
222
223
- VuBlockDev *vdev_blk = get_vu_block_device_by_server(req->server);
224
uint64_t range[2] = { le64_to_cpu(desc.sector) << 9,
225
le32_to_cpu(desc.num_sectors) << 9 };
226
if (type == VIRTIO_BLK_T_DISCARD) {
227
- if (blk_co_pdiscard(vdev_blk->backend, range[0], range[1]) == 0) {
228
+ if (blk_co_pdiscard(blk, range[0], range[1]) == 0) {
229
return 0;
230
}
231
} else if (type == VIRTIO_BLK_T_WRITE_ZEROES) {
232
- if (blk_co_pwrite_zeroes(vdev_blk->backend,
233
- range[0], range[1], 0) == 0) {
234
+ if (blk_co_pwrite_zeroes(blk, range[0], range[1], 0) == 0) {
235
return 0;
236
}
237
}
238
@@ -XXX,XX +XXX,XX @@ vu_block_discard_write_zeroes(VuBlockReq *req, struct iovec *iov,
239
return -EINVAL;
240
}
241
242
-static int coroutine_fn vu_block_flush(VuBlockReq *req)
243
+static void coroutine_fn vu_blk_virtio_process_req(void *opaque)
244
{
245
- VuBlockDev *vdev_blk = get_vu_block_device_by_server(req->server);
246
- BlockBackend *backend = vdev_blk->backend;
247
- return blk_co_flush(backend);
248
-}
249
-
250
-static void coroutine_fn vu_block_virtio_process_req(void *opaque)
251
-{
252
- VuBlockReq *req = opaque;
253
+ VuBlkReq *req = opaque;
254
VuServer *server = req->server;
255
VuVirtqElement *elem = &req->elem;
256
uint32_t type;
257
258
- VuBlockDev *vdev_blk = get_vu_block_device_by_server(server);
259
- BlockBackend *backend = vdev_blk->backend;
260
+ VuBlkExport *vexp = container_of(server, VuBlkExport, vu_server);
261
+ BlockBackend *blk = vexp->export.blk;
262
263
struct iovec *in_iov = elem->in_sg;
264
struct iovec *out_iov = elem->out_sg;
265
@@ -XXX,XX +XXX,XX @@ static void coroutine_fn vu_block_virtio_process_req(void *opaque)
266
bool is_write = type & VIRTIO_BLK_T_OUT;
267
req->sector_num = le64_to_cpu(req->out.sector);
268
269
- int64_t offset = req->sector_num * vdev_blk->blk_size;
270
+ if (is_write && !vexp->writable) {
271
+ req->in->status = VIRTIO_BLK_S_IOERR;
272
+ break;
273
+ }
54
+
274
+
55
+/**
275
+ int64_t offset = req->sector_num * vexp->blk_size;
56
+ * qdev_unplug_blocked: Confirm if a device is blocked from unplug
276
QEMUIOVector qiov;
57
+ *
277
if (is_write) {
58
+ * @dev: Device to be tested
278
qemu_iovec_init_external(&qiov, out_iov, out_num);
59
+ * @reason: Returns one of the reasons why the device is blocked,
279
- ret = blk_co_pwritev(backend, offset, qiov.size,
60
+ * if any
280
- &qiov, 0);
61
+ *
281
+ ret = blk_co_pwritev(blk, offset, qiov.size, &qiov, 0);
62
+ * Returns: true if device is blocked from unplug, false otherwise
282
} else {
63
+ */
283
qemu_iovec_init_external(&qiov, in_iov, in_num);
64
+bool qdev_unplug_blocked(DeviceState *dev, Error **errp);
284
- ret = blk_co_preadv(backend, offset, qiov.size,
285
- &qiov, 0);
286
+ ret = blk_co_preadv(blk, offset, qiov.size, &qiov, 0);
287
}
288
if (ret >= 0) {
289
req->in->status = VIRTIO_BLK_S_OK;
290
@@ -XXX,XX +XXX,XX @@ static void coroutine_fn vu_block_virtio_process_req(void *opaque)
291
break;
292
}
293
case VIRTIO_BLK_T_FLUSH:
294
- if (vu_block_flush(req) == 0) {
295
+ if (blk_co_flush(blk) == 0) {
296
req->in->status = VIRTIO_BLK_S_OK;
297
} else {
298
req->in->status = VIRTIO_BLK_S_IOERR;
299
@@ -XXX,XX +XXX,XX @@ static void coroutine_fn vu_block_virtio_process_req(void *opaque)
300
case VIRTIO_BLK_T_DISCARD:
301
case VIRTIO_BLK_T_WRITE_ZEROES: {
302
int rc;
303
- rc = vu_block_discard_write_zeroes(req, &elem->out_sg[1],
304
- out_num, type);
65
+
305
+
66
/**
306
+ if (!vexp->writable) {
67
* GpioPolarity: Polarity of a GPIO line
307
+ req->in->status = VIRTIO_BLK_S_IOERR;
308
+ break;
309
+ }
310
+
311
+ rc = vu_blk_discard_write_zeroes(blk, &elem->out_sg[1], out_num, type);
312
if (rc == 0) {
313
req->in->status = VIRTIO_BLK_S_OK;
314
} else {
315
@@ -XXX,XX +XXX,XX @@ static void coroutine_fn vu_block_virtio_process_req(void *opaque)
316
break;
317
}
318
319
- vu_block_req_complete(req);
320
+ vu_blk_req_complete(req);
321
return;
322
323
err:
324
- free(elem);
325
+ free(req);
326
}
327
328
-static void vu_block_process_vq(VuDev *vu_dev, int idx)
329
+static void vu_blk_process_vq(VuDev *vu_dev, int idx)
330
{
331
VuServer *server = container_of(vu_dev, VuServer, vu_dev);
332
VuVirtq *vq = vu_get_queue(vu_dev, idx);
333
334
while (1) {
335
- VuBlockReq *req;
336
+ VuBlkReq *req;
337
338
- req = vu_queue_pop(vu_dev, vq, sizeof(VuBlockReq));
339
+ req = vu_queue_pop(vu_dev, vq, sizeof(VuBlkReq));
340
if (!req) {
341
break;
342
}
343
@@ -XXX,XX +XXX,XX @@ static void vu_block_process_vq(VuDev *vu_dev, int idx)
344
req->vq = vq;
345
346
Coroutine *co =
347
- qemu_coroutine_create(vu_block_virtio_process_req, req);
348
+ qemu_coroutine_create(vu_blk_virtio_process_req, req);
349
qemu_coroutine_enter(co);
350
}
351
}
352
353
-static void vu_block_queue_set_started(VuDev *vu_dev, int idx, bool started)
354
+static void vu_blk_queue_set_started(VuDev *vu_dev, int idx, bool started)
355
{
356
VuVirtq *vq;
357
358
assert(vu_dev);
359
360
vq = vu_get_queue(vu_dev, idx);
361
- vu_set_queue_handler(vu_dev, vq, started ? vu_block_process_vq : NULL);
362
+ vu_set_queue_handler(vu_dev, vq, started ? vu_blk_process_vq : NULL);
363
}
364
365
-static uint64_t vu_block_get_features(VuDev *dev)
366
+static uint64_t vu_blk_get_features(VuDev *dev)
367
{
368
uint64_t features;
369
VuServer *server = container_of(dev, VuServer, vu_dev);
370
- VuBlockDev *vdev_blk = get_vu_block_device_by_server(server);
371
+ VuBlkExport *vexp = container_of(server, VuBlkExport, vu_server);
372
features = 1ull << VIRTIO_BLK_F_SIZE_MAX |
373
1ull << VIRTIO_BLK_F_SEG_MAX |
374
1ull << VIRTIO_BLK_F_TOPOLOGY |
375
@@ -XXX,XX +XXX,XX @@ static uint64_t vu_block_get_features(VuDev *dev)
376
1ull << VIRTIO_RING_F_EVENT_IDX |
377
1ull << VHOST_USER_F_PROTOCOL_FEATURES;
378
379
- if (!vdev_blk->writable) {
380
+ if (!vexp->writable) {
381
features |= 1ull << VIRTIO_BLK_F_RO;
382
}
383
384
return features;
385
}
386
387
-static uint64_t vu_block_get_protocol_features(VuDev *dev)
388
+static uint64_t vu_blk_get_protocol_features(VuDev *dev)
389
{
390
return 1ull << VHOST_USER_PROTOCOL_F_CONFIG |
391
1ull << VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD;
392
}
393
394
static int
395
-vu_block_get_config(VuDev *vu_dev, uint8_t *config, uint32_t len)
396
+vu_blk_get_config(VuDev *vu_dev, uint8_t *config, uint32_t len)
397
{
398
+ /* TODO blkcfg must be little-endian for VIRTIO 1.0 */
399
VuServer *server = container_of(vu_dev, VuServer, vu_dev);
400
- VuBlockDev *vdev_blk = get_vu_block_device_by_server(server);
401
- memcpy(config, &vdev_blk->blkcfg, len);
402
-
403
+ VuBlkExport *vexp = container_of(server, VuBlkExport, vu_server);
404
+ memcpy(config, &vexp->blkcfg, len);
405
return 0;
406
}
407
408
static int
409
-vu_block_set_config(VuDev *vu_dev, const uint8_t *data,
410
+vu_blk_set_config(VuDev *vu_dev, const uint8_t *data,
411
uint32_t offset, uint32_t size, uint32_t flags)
412
{
413
VuServer *server = container_of(vu_dev, VuServer, vu_dev);
414
- VuBlockDev *vdev_blk = get_vu_block_device_by_server(server);
415
+ VuBlkExport *vexp = container_of(server, VuBlkExport, vu_server);
416
uint8_t wce;
417
418
/* don't support live migration */
419
@@ -XXX,XX +XXX,XX @@ vu_block_set_config(VuDev *vu_dev, const uint8_t *data,
420
}
421
422
wce = *data;
423
- vdev_blk->blkcfg.wce = wce;
424
- blk_set_enable_write_cache(vdev_blk->backend, wce);
425
+ vexp->blkcfg.wce = wce;
426
+ blk_set_enable_write_cache(vexp->export.blk, wce);
427
return 0;
428
}
429
430
@@ -XXX,XX +XXX,XX @@ vu_block_set_config(VuDev *vu_dev, const uint8_t *data,
431
* of vu_process_message.
68
*
432
*
69
diff --git a/hw/core/qdev.c b/hw/core/qdev.c
433
*/
434
-static int vu_block_process_msg(VuDev *dev, VhostUserMsg *vmsg, int *do_reply)
435
+static int vu_blk_process_msg(VuDev *dev, VhostUserMsg *vmsg, int *do_reply)
436
{
437
if (vmsg->request == VHOST_USER_NONE) {
438
dev->panic(dev, "disconnect");
439
@@ -XXX,XX +XXX,XX @@ static int vu_block_process_msg(VuDev *dev, VhostUserMsg *vmsg, int *do_reply)
440
return false;
441
}
442
443
-static const VuDevIface vu_block_iface = {
444
- .get_features = vu_block_get_features,
445
- .queue_set_started = vu_block_queue_set_started,
446
- .get_protocol_features = vu_block_get_protocol_features,
447
- .get_config = vu_block_get_config,
448
- .set_config = vu_block_set_config,
449
- .process_msg = vu_block_process_msg,
450
+static const VuDevIface vu_blk_iface = {
451
+ .get_features = vu_blk_get_features,
452
+ .queue_set_started = vu_blk_queue_set_started,
453
+ .get_protocol_features = vu_blk_get_protocol_features,
454
+ .get_config = vu_blk_get_config,
455
+ .set_config = vu_blk_set_config,
456
+ .process_msg = vu_blk_process_msg,
457
};
458
459
static void blk_aio_attached(AioContext *ctx, void *opaque)
460
{
461
- VuBlockDev *vub_dev = opaque;
462
- vhost_user_server_attach_aio_context(&vub_dev->vu_server, ctx);
463
+ VuBlkExport *vexp = opaque;
464
+ vhost_user_server_attach_aio_context(&vexp->vu_server, ctx);
465
}
466
467
static void blk_aio_detach(void *opaque)
468
{
469
- VuBlockDev *vub_dev = opaque;
470
- vhost_user_server_detach_aio_context(&vub_dev->vu_server);
471
+ VuBlkExport *vexp = opaque;
472
+ vhost_user_server_detach_aio_context(&vexp->vu_server);
473
}
474
475
static void
476
-vu_block_initialize_config(BlockDriverState *bs,
477
+vu_blk_initialize_config(BlockDriverState *bs,
478
struct virtio_blk_config *config, uint32_t blk_size)
479
{
480
config->capacity = bdrv_getlength(bs) >> BDRV_SECTOR_BITS;
481
@@ -XXX,XX +XXX,XX @@ vu_block_initialize_config(BlockDriverState *bs,
482
config->max_write_zeroes_seg = 1;
483
}
484
485
-static VuBlockDev *vu_block_init(VuBlockDev *vu_block_device, Error **errp)
486
+static void vu_blk_exp_request_shutdown(BlockExport *exp)
487
{
488
+ VuBlkExport *vexp = container_of(exp, VuBlkExport, export);
489
490
- BlockBackend *blk;
491
- Error *local_error = NULL;
492
- const char *node_name = vu_block_device->node_name;
493
- bool writable = vu_block_device->writable;
494
- uint64_t perm = BLK_PERM_CONSISTENT_READ;
495
- int ret;
496
-
497
- AioContext *ctx;
498
-
499
- BlockDriverState *bs = bdrv_lookup_bs(node_name, node_name, &local_error);
500
-
501
- if (!bs) {
502
- error_propagate(errp, local_error);
503
- return NULL;
504
- }
505
-
506
- if (bdrv_is_read_only(bs)) {
507
- writable = false;
508
- }
509
-
510
- if (writable) {
511
- perm |= BLK_PERM_WRITE;
512
- }
513
-
514
- ctx = bdrv_get_aio_context(bs);
515
- aio_context_acquire(ctx);
516
- bdrv_invalidate_cache(bs, NULL);
517
- aio_context_release(ctx);
518
-
519
- /*
520
- * Don't allow resize while the vhost user server is running,
521
- * otherwise we don't care what happens with the node.
522
- */
523
- blk = blk_new(bdrv_get_aio_context(bs), perm,
524
- BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE_UNCHANGED |
525
- BLK_PERM_WRITE | BLK_PERM_GRAPH_MOD);
526
- ret = blk_insert_bs(blk, bs, errp);
527
-
528
- if (ret < 0) {
529
- goto fail;
530
- }
531
-
532
- blk_set_enable_write_cache(blk, false);
533
-
534
- blk_set_allow_aio_context_change(blk, true);
535
-
536
- vu_block_device->blkcfg.wce = 0;
537
- vu_block_device->backend = blk;
538
- if (!vu_block_device->blk_size) {
539
- vu_block_device->blk_size = BDRV_SECTOR_SIZE;
540
- }
541
- vu_block_device->blkcfg.blk_size = vu_block_device->blk_size;
542
- blk_set_guest_block_size(blk, vu_block_device->blk_size);
543
- vu_block_initialize_config(bs, &vu_block_device->blkcfg,
544
- vu_block_device->blk_size);
545
- return vu_block_device;
546
-
547
-fail:
548
- blk_unref(blk);
549
- return NULL;
550
-}
551
-
552
-static void vu_block_deinit(VuBlockDev *vu_block_device)
553
-{
554
- if (vu_block_device->backend) {
555
- blk_remove_aio_context_notifier(vu_block_device->backend, blk_aio_attached,
556
- blk_aio_detach, vu_block_device);
557
- }
558
-
559
- blk_unref(vu_block_device->backend);
560
-}
561
-
562
-static void vhost_user_blk_server_stop(VuBlockDev *vu_block_device)
563
-{
564
- vhost_user_server_stop(&vu_block_device->vu_server);
565
- vu_block_deinit(vu_block_device);
566
-}
567
-
568
-static void vhost_user_blk_server_start(VuBlockDev *vu_block_device,
569
- Error **errp)
570
-{
571
- AioContext *ctx;
572
- SocketAddress *addr = vu_block_device->addr;
573
-
574
- if (!vu_block_init(vu_block_device, errp)) {
575
- return;
576
- }
577
-
578
- ctx = bdrv_get_aio_context(blk_bs(vu_block_device->backend));
579
-
580
- if (!vhost_user_server_start(&vu_block_device->vu_server, addr, ctx,
581
- VHOST_USER_BLK_MAX_QUEUES, &vu_block_iface,
582
- errp)) {
583
- goto error;
584
- }
585
-
586
- blk_add_aio_context_notifier(vu_block_device->backend, blk_aio_attached,
587
- blk_aio_detach, vu_block_device);
588
- vu_block_device->running = true;
589
- return;
590
-
591
- error:
592
- vu_block_deinit(vu_block_device);
593
-}
594
-
595
-static bool vu_prop_modifiable(VuBlockDev *vus, Error **errp)
596
-{
597
- if (vus->running) {
598
- error_setg(errp, "The property can't be modified "
599
- "while the server is running");
600
- return false;
601
- }
602
- return true;
603
-}
604
-
605
-static void vu_set_node_name(Object *obj, const char *value, Error **errp)
606
-{
607
- VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
608
-
609
- if (!vu_prop_modifiable(vus, errp)) {
610
- return;
611
- }
612
-
613
- if (vus->node_name) {
614
- g_free(vus->node_name);
615
- }
616
-
617
- vus->node_name = g_strdup(value);
618
-}
619
-
620
-static char *vu_get_node_name(Object *obj, Error **errp)
621
-{
622
- VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
623
- return g_strdup(vus->node_name);
624
-}
625
-
626
-static void free_socket_addr(SocketAddress *addr)
627
-{
628
- g_free(addr->u.q_unix.path);
629
- g_free(addr);
630
-}
631
-
632
-static void vu_set_unix_socket(Object *obj, const char *value,
633
- Error **errp)
634
-{
635
- VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
636
-
637
- if (!vu_prop_modifiable(vus, errp)) {
638
- return;
639
- }
640
-
641
- if (vus->addr) {
642
- free_socket_addr(vus->addr);
643
- }
644
-
645
- SocketAddress *addr = g_new0(SocketAddress, 1);
646
- addr->type = SOCKET_ADDRESS_TYPE_UNIX;
647
- addr->u.q_unix.path = g_strdup(value);
648
- vus->addr = addr;
649
+ vhost_user_server_stop(&vexp->vu_server);
650
}
651
652
-static char *vu_get_unix_socket(Object *obj, Error **errp)
653
+static int vu_blk_exp_create(BlockExport *exp, BlockExportOptions *opts,
654
+ Error **errp)
655
{
656
- VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
657
- return g_strdup(vus->addr->u.q_unix.path);
658
-}
659
-
660
-static bool vu_get_block_writable(Object *obj, Error **errp)
661
-{
662
- VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
663
- return vus->writable;
664
-}
665
-
666
-static void vu_set_block_writable(Object *obj, bool value, Error **errp)
667
-{
668
- VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
669
-
670
- if (!vu_prop_modifiable(vus, errp)) {
671
- return;
672
- }
673
-
674
- vus->writable = value;
675
-}
676
-
677
-static void vu_get_blk_size(Object *obj, Visitor *v, const char *name,
678
- void *opaque, Error **errp)
679
-{
680
- VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
681
- uint32_t value = vus->blk_size;
682
-
683
- visit_type_uint32(v, name, &value, errp);
684
-}
685
-
686
-static void vu_set_blk_size(Object *obj, Visitor *v, const char *name,
687
- void *opaque, Error **errp)
688
-{
689
- VuBlockDev *vus = VHOST_USER_BLK_SERVER(obj);
690
-
691
+ VuBlkExport *vexp = container_of(exp, VuBlkExport, export);
692
+ BlockExportOptionsVhostUserBlk *vu_opts = &opts->u.vhost_user_blk;
693
Error *local_err = NULL;
694
- uint32_t value;
695
+ uint64_t logical_block_size;
696
697
- if (!vu_prop_modifiable(vus, errp)) {
698
- return;
699
- }
700
+ vexp->writable = opts->writable;
701
+ vexp->blkcfg.wce = 0;
702
703
- visit_type_uint32(v, name, &value, &local_err);
704
- if (local_err) {
705
- goto out;
706
+ if (vu_opts->has_logical_block_size) {
707
+ logical_block_size = vu_opts->logical_block_size;
708
+ } else {
709
+ logical_block_size = BDRV_SECTOR_SIZE;
710
}
711
-
712
- check_block_size(object_get_typename(obj), name, value, &local_err);
713
+ check_block_size(exp->id, "logical-block-size", logical_block_size,
714
+ &local_err);
715
if (local_err) {
716
- goto out;
717
+ error_propagate(errp, local_err);
718
+ return -EINVAL;
719
+ }
720
+ vexp->blk_size = logical_block_size;
721
+ blk_set_guest_block_size(exp->blk, logical_block_size);
722
+ vu_blk_initialize_config(blk_bs(exp->blk), &vexp->blkcfg,
723
+ logical_block_size);
724
+
725
+ blk_set_allow_aio_context_change(exp->blk, true);
726
+ blk_add_aio_context_notifier(exp->blk, blk_aio_attached, blk_aio_detach,
727
+ vexp);
728
+
729
+ if (!vhost_user_server_start(&vexp->vu_server, vu_opts->addr, exp->ctx,
730
+ VHOST_USER_BLK_MAX_QUEUES, &vu_blk_iface,
731
+ errp)) {
732
+ blk_remove_aio_context_notifier(exp->blk, blk_aio_attached,
733
+ blk_aio_detach, vexp);
734
+ return -EADDRNOTAVAIL;
735
}
736
737
- vus->blk_size = value;
738
-
739
-out:
740
- error_propagate(errp, local_err);
741
-}
742
-
743
-static void vhost_user_blk_server_instance_finalize(Object *obj)
744
-{
745
- VuBlockDev *vub = VHOST_USER_BLK_SERVER(obj);
746
-
747
- vhost_user_blk_server_stop(vub);
748
-
749
- /*
750
- * Unlike object_property_add_str, object_class_property_add_str
751
- * doesn't have a release method. Thus manual memory freeing is
752
- * needed.
753
- */
754
- free_socket_addr(vub->addr);
755
- g_free(vub->node_name);
756
-}
757
-
758
-static void vhost_user_blk_server_complete(UserCreatable *obj, Error **errp)
759
-{
760
- VuBlockDev *vub = VHOST_USER_BLK_SERVER(obj);
761
-
762
- vhost_user_blk_server_start(vub, errp);
763
+ return 0;
764
}
765
766
-static void vhost_user_blk_server_class_init(ObjectClass *klass,
767
- void *class_data)
768
+static void vu_blk_exp_delete(BlockExport *exp)
769
{
770
- UserCreatableClass *ucc = USER_CREATABLE_CLASS(klass);
771
- ucc->complete = vhost_user_blk_server_complete;
772
-
773
- object_class_property_add_bool(klass, "writable",
774
- vu_get_block_writable,
775
- vu_set_block_writable);
776
-
777
- object_class_property_add_str(klass, "node-name",
778
- vu_get_node_name,
779
- vu_set_node_name);
780
-
781
- object_class_property_add_str(klass, "unix-socket",
782
- vu_get_unix_socket,
783
- vu_set_unix_socket);
784
+ VuBlkExport *vexp = container_of(exp, VuBlkExport, export);
785
786
- object_class_property_add(klass, "logical-block-size", "uint32",
787
- vu_get_blk_size, vu_set_blk_size,
788
- NULL, NULL);
789
+ blk_remove_aio_context_notifier(exp->blk, blk_aio_attached, blk_aio_detach,
790
+ vexp);
791
}
792
793
-static const TypeInfo vhost_user_blk_server_info = {
794
- .name = TYPE_VHOST_USER_BLK_SERVER,
795
- .parent = TYPE_OBJECT,
796
- .instance_size = sizeof(VuBlockDev),
797
- .instance_finalize = vhost_user_blk_server_instance_finalize,
798
- .class_init = vhost_user_blk_server_class_init,
799
- .interfaces = (InterfaceInfo[]) {
800
- {TYPE_USER_CREATABLE},
801
- {}
802
- },
803
+const BlockExportDriver blk_exp_vhost_user_blk = {
804
+ .type = BLOCK_EXPORT_TYPE_VHOST_USER_BLK,
805
+ .instance_size = sizeof(VuBlkExport),
806
+ .create = vu_blk_exp_create,
807
+ .delete = vu_blk_exp_delete,
808
+ .request_shutdown = vu_blk_exp_request_shutdown,
809
};
810
-
811
-static void vhost_user_blk_server_register_types(void)
812
-{
813
- type_register_static(&vhost_user_blk_server_info);
814
-}
815
-
816
-type_init(vhost_user_blk_server_register_types)
817
diff --git a/util/vhost-user-server.c b/util/vhost-user-server.c
70
index XXXXXXX..XXXXXXX 100644
818
index XXXXXXX..XXXXXXX 100644
71
--- a/hw/core/qdev.c
819
--- a/util/vhost-user-server.c
72
+++ b/hw/core/qdev.c
820
+++ b/util/vhost-user-server.c
73
@@ -XXX,XX +XXX,XX @@ char *qdev_get_dev_path(DeviceState *dev)
821
@@ -XXX,XX +XXX,XX @@ bool vhost_user_server_start(VuServer *server,
74
return NULL;
822
Error **errp)
75
}
823
{
76
824
QEMUBH *bh;
77
+void qdev_add_unplug_blocker(DeviceState *dev, Error *reason)
825
- QIONetListener *listener = qio_net_listener_new();
78
+{
826
+ QIONetListener *listener;
79
+ dev->unplug_blockers = g_slist_prepend(dev->unplug_blockers, reason);
80
+}
81
+
827
+
82
+void qdev_del_unplug_blocker(DeviceState *dev, Error *reason)
828
+ if (socket_addr->type != SOCKET_ADDRESS_TYPE_UNIX &&
83
+{
829
+ socket_addr->type != SOCKET_ADDRESS_TYPE_FD) {
84
+ dev->unplug_blockers = g_slist_remove(dev->unplug_blockers, reason);
830
+ error_setg(errp, "Only socket address types 'unix' and 'fd' are supported");
85
+}
831
+ return false;
86
+
87
+bool qdev_unplug_blocked(DeviceState *dev, Error **errp)
88
+{
89
+ ERRP_GUARD();
90
+
91
+ if (dev->unplug_blockers) {
92
+ error_propagate(errp, error_copy(dev->unplug_blockers->data));
93
+ return true;
94
+ }
832
+ }
95
+
833
+
96
+ return false;
834
+ listener = qio_net_listener_new();
97
+}
835
if (qio_net_listener_open_sync(listener, socket_addr, 1,
98
+
836
errp) < 0) {
99
static bool device_get_realized(Object *obj, Error **errp)
837
object_unref(OBJECT(listener));
100
{
838
diff --git a/block/export/meson.build b/block/export/meson.build
101
DeviceState *dev = DEVICE(obj);
102
@@ -XXX,XX +XXX,XX @@ static void device_finalize(Object *obj)
103
104
DeviceState *dev = DEVICE(obj);
105
106
+ g_assert(!dev->unplug_blockers);
107
+
108
QLIST_FOREACH_SAFE(ngl, &dev->gpios, node, next) {
109
QLIST_REMOVE(ngl, node);
110
qemu_free_irqs(ngl->in, ngl->num_in);
111
diff --git a/softmmu/qdev-monitor.c b/softmmu/qdev-monitor.c
112
index XXXXXXX..XXXXXXX 100644
839
index XXXXXXX..XXXXXXX 100644
113
--- a/softmmu/qdev-monitor.c
840
--- a/block/export/meson.build
114
+++ b/softmmu/qdev-monitor.c
841
+++ b/block/export/meson.build
115
@@ -XXX,XX +XXX,XX @@ void qdev_unplug(DeviceState *dev, Error **errp)
842
@@ -1 +1,2 @@
116
HotplugHandlerClass *hdc;
843
block_ss.add(files('export.c'))
117
Error *local_err = NULL;
844
+block_ss.add(when: 'CONFIG_LINUX', if_true: files('vhost-user-blk-server.c', '../../contrib/libvhost-user/libvhost-user.c'))
118
845
diff --git a/block/meson.build b/block/meson.build
119
+ if (qdev_unplug_blocked(dev, errp)) {
846
index XXXXXXX..XXXXXXX 100644
120
+ return;
847
--- a/block/meson.build
121
+ }
848
+++ b/block/meson.build
122
+
849
@@ -XXX,XX +XXX,XX @@ block_ss.add(when: 'CONFIG_WIN32', if_true: files('file-win32.c', 'win32-aio.c')
123
if (dev->parent_bus && !qbus_is_hotpluggable(dev->parent_bus)) {
850
block_ss.add(when: 'CONFIG_POSIX', if_true: [files('file-posix.c'), coref, iokit])
124
error_setg(errp, QERR_BUS_NO_HOTPLUG, dev->parent_bus->name);
851
block_ss.add(when: 'CONFIG_LIBISCSI', if_true: files('iscsi-opts.c'))
125
return;
852
block_ss.add(when: 'CONFIG_LINUX', if_true: files('nvme.c'))
853
-block_ss.add(when: 'CONFIG_LINUX', if_true: files('export/vhost-user-blk-server.c', '../contrib/libvhost-user/libvhost-user.c'))
854
block_ss.add(when: 'CONFIG_REPLICATION', if_true: files('replication.c'))
855
block_ss.add(when: 'CONFIG_SHEEPDOG', if_true: files('sheepdog.c'))
856
block_ss.add(when: ['CONFIG_LINUX_AIO', libaio], if_true: files('linux-aio.c'))
126
--
857
--
127
2.36.1
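
For illustration, here is a minimal, self-contained sketch of the unplug-blocker pattern that the qdev hunks in this patch add. It is not the QEMU code: `Device`, `Blocker`, and the `device_*` helpers are hypothetical stand-ins for DeviceState, the GSList of Error reasons, and qdev_add_unplug_blocker(), qdev_del_unplug_blocker(), and qdev_unplug_blocked(). Reasons are plain strings here rather than Error objects.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for one entry in dev->unplug_blockers. */
typedef struct Blocker {
    const char *reason;
    struct Blocker *next;
} Blocker;

/* Hypothetical stand-in for DeviceState. */
typedef struct Device {
    Blocker *unplug_blockers;
} Device;

/* Prepend a reason, mirroring the g_slist_prepend() call in the patch. */
static void device_add_unplug_blocker(Device *dev, const char *reason)
{
    Blocker *b = malloc(sizeof(*b));

    assert(b != NULL);
    b->reason = reason;
    b->next = dev->unplug_blockers;
    dev->unplug_blockers = b;
}

/* Remove the first entry holding this exact pointer, like g_slist_remove(). */
static void device_del_unplug_blocker(Device *dev, const char *reason)
{
    Blocker **p = &dev->unplug_blockers;

    while (*p) {
        if ((*p)->reason == reason) {
            Blocker *dead = *p;

            *p = dead->next;
            free(dead);
            return;
        }
        p = &(*p)->next;
    }
}

/* Return true while any blocker exists; report the newest reason. */
static bool device_unplug_blocked(const Device *dev, const char **reason)
{
    if (dev->unplug_blockers) {
        if (reason) {
            *reason = dev->unplug_blockers->reason;
        }
        return true;
    }
    return false;
}
```

As with the g_assert() added to device_finalize() in the patch, the sketch expects every blocker to be removed before the device goes away.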
858
2.26.2
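
For illustration, the logical-block-size handling in vu_blk_exp_create() can be sketched in isolation. This is only a sketch under stated assumptions: it assumes check_block_size() accepts powers of two within a fixed range. The 512-byte default has the same value as BDRV_SECTOR_SIZE, but the 2 MiB upper bound and all `sketch_*` / `SKETCH_*` names are hypothetical and not taken from this patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MIN_BLOCK_SIZE 512u                /* same value as BDRV_SECTOR_SIZE */
#define SKETCH_MAX_BLOCK_SIZE (2u * 1024 * 1024)  /* assumed upper bound */

/* Accept only powers of two inside the assumed range. */
static bool sketch_block_size_ok(uint64_t size)
{
    if (size < SKETCH_MIN_BLOCK_SIZE || size > SKETCH_MAX_BLOCK_SIZE) {
        return false;
    }
    return (size & (size - 1)) == 0;
}

/*
 * Use the caller's value when the option was given, otherwise fall back
 * to the default, then validate. Returns 0 and stores the result in *out,
 * or -EINVAL (leaving *out untouched) when the size is rejected.
 */
static int sketch_pick_block_size(bool has_size, uint64_t size, uint64_t *out)
{
    uint64_t chosen = has_size ? size : SKETCH_MIN_BLOCK_SIZE;

    assert(out != NULL);
    if (!sketch_block_size_ok(chosen)) {
        return -EINVAL;
    }
    *out = chosen;
    return 0;
}
```

The early -EINVAL return mirrors the patch, which propagates the error before any notifier is registered or the server is started.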
859
1
From: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
1
Headers used by other subsystems are located in include/. Also add the
2
vhost-user-server and vhost-user-blk-server headers to MAINTAINERS.
2
3
3
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
4
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
4
Message-id: 20220526115432.138384-1-vsementsov@yandex-team.ru
5
Message-id: 20200924151549.913737-13-stefanha@redhat.com
5
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
6
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
6
---
7
---
7
MAINTAINERS | 22 ++++++++++++----------
8
MAINTAINERS | 4 +++-
8
1 file changed, 12 insertions(+), 10 deletions(-)
9
{util => include/qemu}/vhost-user-server.h | 0
10
block/export/vhost-user-blk-server.c | 2 +-
11
util/vhost-user-server.c | 2 +-
12
4 files changed, 5 insertions(+), 3 deletions(-)
13
rename {util => include/qemu}/vhost-user-server.h (100%)
9
14
10
diff --git a/MAINTAINERS b/MAINTAINERS
15
diff --git a/MAINTAINERS b/MAINTAINERS
11
index XXXXXXX..XXXXXXX 100644
16
index XXXXXXX..XXXXXXX 100644
12
--- a/MAINTAINERS
17
--- a/MAINTAINERS
13
+++ b/MAINTAINERS
18
+++ b/MAINTAINERS
14
@@ -XXX,XX +XXX,XX @@ F: scsi/*
19
@@ -XXX,XX +XXX,XX @@ Vhost-user block device backend server
15
20
M: Coiby Xu <Coiby.Xu@gmail.com>
16
Block Jobs
17
M: John Snow <jsnow@redhat.com>
18
-M: Vladimir Sementsov-Ogievskiy <v.sementsov-og@mail.ru>
19
+M: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
20
L: qemu-block@nongnu.org
21
S: Supported
22
F: blockjob.c
23
@@ -XXX,XX +XXX,XX @@ F: block/aio_task.c
24
F: util/qemu-co-shared-resource.c
25
F: include/qemu/co-shared-resource.h
26
T: git https://gitlab.com/jsnow/qemu.git jobs
27
-T: git https://src.openvz.org/scm/~vsementsov/qemu.git jobs
28
+T: git https://gitlab.com/vsementsov/qemu.git block
29
30
Block QAPI, monitor, command line
31
M: Markus Armbruster <armbru@redhat.com>
32
@@ -XXX,XX +XXX,XX @@ F: include/hw/cxl/
33
34
Dirty Bitmaps
35
M: Eric Blake <eblake@redhat.com>
36
-M: Vladimir Sementsov-Ogievskiy <v.sementsov-og@mail.ru>
37
+M: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
38
R: John Snow <jsnow@redhat.com>
39
L: qemu-block@nongnu.org
40
S: Supported
41
@@ -XXX,XX +XXX,XX @@ F: util/hbitmap.c
42
F: tests/unit/test-hbitmap.c
43
F: docs/interop/bitmaps.rst
44
T: git https://repo.or.cz/qemu/ericb.git bitmaps
45
+T: git https://gitlab.com/vsementsov/qemu.git block
46
47
Character device backends
48
M: Marc-André Lureau <marcandre.lureau@redhat.com>
49
@@ -XXX,XX +XXX,XX @@ F: scripts/*.py
50
F: tests/*.py
51
52
Benchmark util
53
-M: Vladimir Sementsov-Ogievskiy <v.sementsov-og@mail.ru>
54
+M: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
55
S: Maintained
21
S: Maintained
56
F: scripts/simplebench/
22
F: block/export/vhost-user-blk-server.c
57
-T: git https://src.openvz.org/scm/~vsementsov/qemu.git simplebench
23
-F: util/vhost-user-server.c
58
+T: git https://gitlab.com/vsementsov/qemu.git simplebench
24
+F: block/export/vhost-user-blk-server.h
59
25
+F: include/qemu/vhost-user-server.h
60
Transactions helper
26
F: tests/qtest/libqos/vhost-user-blk.c
61
-M: Vladimir Sementsov-Ogievskiy <v.sementsov-og@mail.ru>
27
+F: util/vhost-user-server.c
62
+M: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
28
63
S: Maintained
29
Replication
64
F: include/qemu/transactions.h
30
M: Wen Congyang <wencongyang2@huawei.com>
65
F: util/transactions.c
31
diff --git a/util/vhost-user-server.h b/include/qemu/vhost-user-server.h
66
+T: git https://gitlab.com/vsementsov/qemu.git block
32
similarity index 100%
67
33
rename from util/vhost-user-server.h
68
QAPI
34
rename to include/qemu/vhost-user-server.h
69
M: Markus Armbruster <armbru@redhat.com>
35
diff --git a/block/export/vhost-user-blk-server.c b/block/export/vhost-user-blk-server.c
70
@@ -XXX,XX +XXX,XX @@ F: block/iscsi-opts.c
36
index XXXXXXX..XXXXXXX 100644
71
37
--- a/block/export/vhost-user-blk-server.c
72
Network Block Device (NBD)
38
+++ b/block/export/vhost-user-blk-server.c
73
M: Eric Blake <eblake@redhat.com>
39
@@ -XXX,XX +XXX,XX @@
74
-M: Vladimir Sementsov-Ogievskiy <v.sementsov-og@mail.ru>
40
#include "block/block.h"
75
+M: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
41
#include "contrib/libvhost-user/libvhost-user.h"
76
L: qemu-block@nongnu.org
42
#include "standard-headers/linux/virtio_blk.h"
77
S: Maintained
43
-#include "util/vhost-user-server.h"
78
F: block/nbd*
44
+#include "qemu/vhost-user-server.h"
79
@@ -XXX,XX +XXX,XX @@ F: docs/interop/nbd.txt
45
#include "vhost-user-blk-server.h"
80
F: docs/tools/qemu-nbd.rst
46
#include "qapi/error.h"
81
F: tests/qemu-iotests/tests/*nbd*
47
#include "qom/object_interfaces.h"
82
T: git https://repo.or.cz/qemu/ericb.git nbd
48
diff --git a/util/vhost-user-server.c b/util/vhost-user-server.c
83
-T: git https://src.openvz.org/scm/~vsementsov/qemu.git nbd
49
index XXXXXXX..XXXXXXX 100644
84
+T: git https://gitlab.com/vsementsov/qemu.git block
50
--- a/util/vhost-user-server.c
85
51
+++ b/util/vhost-user-server.c
86
NFS
52
@@ -XXX,XX +XXX,XX @@
87
M: Peter Lieven <pl@kamp.de>
53
*/
88
@@ -XXX,XX +XXX,XX @@ F: block/dmg.c
54
#include "qemu/osdep.h"
89
parallels
55
#include "qemu/main-loop.h"
90
M: Stefan Hajnoczi <stefanha@redhat.com>
56
+#include "qemu/vhost-user-server.h"
91
M: Denis V. Lunev <den@openvz.org>
57
#include "block/aio-wait.h"
92
-M: Vladimir Sementsov-Ogievskiy <v.sementsov-og@mail.ru>
58
-#include "vhost-user-server.h"
93
+M: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
59
94
L: qemu-block@nongnu.org
60
/*
95
S: Supported
61
* Theory of operation:
96
F: block/parallels.c
97
F: block/parallels-ext.c
98
F: docs/interop/parallels.txt
99
-T: git https://src.openvz.org/scm/~vsementsov/qemu.git parallels
100
+T: git https://gitlab.com/vsementsov/qemu.git block
101
102
qed
103
M: Stefan Hajnoczi <stefanha@redhat.com>
104
--
62
--
105
2.36.1
63
2.26.2
106
64
107
1
From: Jagannathan Raman <jag.raman@oracle.com>
1
Don't compile contrib/libvhost-user/libvhost-user.c again. Instead build
2
the static library once and then reuse it throughout QEMU.
2
3
3
Add the libvfio-user library as a submodule. Build it as a meson
4
Also switch from CONFIG_LINUX to CONFIG_VHOST_USER, which is what the
4
subproject.
5
vhost-user tools (vhost-user-gpu, etc.) do.
5
6
libvfio-user is distributed with BSD 3-Clause license and
7
json-c with MIT (Expat) license
8
9
Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
10
Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
11
Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
12
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
13
Message-id: c2adec87958b081d1dc8775d4aa05c897912f025.1655151679.git.jag.raman@oracle.com
14
15
[Changed submodule URL to QEMU's libvfio-user mirror on GitLab. The QEMU
16
project mirrors its dependencies so that it can provide full source code
17
even in the event that its dependencies become unavailable. Note that
18
the mirror repo is manually updated, so please contact me to make newer
19
libvfio-user commits available. If I become a bottleneck we can set up a
20
cronjob.
21
22
Updated scripts/meson-buildoptions.sh to match the meson_options.txt
23
change. Failure to do so can result in scripts/meson-buildoptions.sh
24
being modified by the build system later on and you end up with a dirty
25
working tree.
26
--Stefan]
27
6
28
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
7
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
8
Message-id: 20200924151549.913737-14-stefanha@redhat.com
9
[Added CONFIG_LINUX again because libvhost-user doesn't build on macOS.
10
--Stefan]
11
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
29
---
12
---
30
MAINTAINERS | 1 +
13
block/export/export.c | 8 ++++----
31
meson_options.txt | 2 ++
14
block/export/meson.build | 2 +-
32
configure | 17 +++++++++++++++++
15
contrib/libvhost-user/meson.build | 1 +
33
meson.build | 23 ++++++++++++++++++++++-
16
meson.build | 6 +++++-
34
.gitlab-ci.d/buildtest.yml | 1 +
17
util/meson.build | 4 +++-
35
.gitmodules | 3 +++
18
5 files changed, 14 insertions(+), 7 deletions(-)
36
Kconfig.host | 4 ++++
37
hw/remote/Kconfig | 4 ++++
38
hw/remote/meson.build | 2 ++
39
scripts/meson-buildoptions.sh | 4 ++++
40
subprojects/libvfio-user | 1 +
41
tests/docker/dockerfiles/centos8.docker | 2 ++
42
12 files changed, 63 insertions(+), 1 deletion(-)
43
create mode 160000 subprojects/libvfio-user
44
19
45
diff --git a/MAINTAINERS b/MAINTAINERS
20
diff --git a/block/export/export.c b/block/export/export.c
46
index XXXXXXX..XXXXXXX 100644
21
index XXXXXXX..XXXXXXX 100644
47
--- a/MAINTAINERS
22
--- a/block/export/export.c
48
+++ b/MAINTAINERS
23
+++ b/block/export/export.c
49
@@ -XXX,XX +XXX,XX @@ F: hw/remote/proxy-memory-listener.c
24
@@ -XXX,XX +XXX,XX @@
50
F: include/hw/remote/proxy-memory-listener.h
25
#include "sysemu/block-backend.h"
51
F: hw/remote/iohub.c
26
#include "block/export.h"
52
F: include/hw/remote/iohub.h
27
#include "block/nbd.h"
53
+F: subprojects/libvfio-user
28
-#if CONFIG_LINUX
54
29
-#include "block/export/vhost-user-blk-server.h"
55
EBPF:
30
-#endif
56
M: Jason Wang <jasowang@redhat.com>
31
#include "qapi/error.h"
57
diff --git a/meson_options.txt b/meson_options.txt
32
#include "qapi/qapi-commands-block-export.h"
33
#include "qapi/qapi-events-block-export.h"
34
#include "qemu/id.h"
35
+#ifdef CONFIG_VHOST_USER
36
+#include "vhost-user-blk-server.h"
37
+#endif
38
39
static const BlockExportDriver *blk_exp_drivers[] = {
40
&blk_exp_nbd,
41
-#if CONFIG_LINUX
42
+#ifdef CONFIG_VHOST_USER
43
&blk_exp_vhost_user_blk,
44
#endif
45
};
46
diff --git a/block/export/meson.build b/block/export/meson.build
58
index XXXXXXX..XXXXXXX 100644
47
index XXXXXXX..XXXXXXX 100644
59
--- a/meson_options.txt
48
--- a/block/export/meson.build
60
+++ b/meson_options.txt
49
+++ b/block/export/meson.build
61
@@ -XXX,XX +XXX,XX @@ option('cfi_debug', type: 'boolean', value: 'false',
50
@@ -XXX,XX +XXX,XX @@
62
description: 'Verbose errors in case of CFI violation')
51
block_ss.add(files('export.c'))
63
option('multiprocess', type: 'feature', value: 'auto',
52
-block_ss.add(when: 'CONFIG_LINUX', if_true: files('vhost-user-blk-server.c', '../../contrib/libvhost-user/libvhost-user.c'))
64
description: 'Out of process device emulation support')
53
+block_ss.add(when: ['CONFIG_LINUX', 'CONFIG_VHOST_USER'], if_true: files('vhost-user-blk-server.c'))
65
+option('vfio_user_server', type: 'feature', value: 'disabled',
54
diff --git a/contrib/libvhost-user/meson.build b/contrib/libvhost-user/meson.build
66
+ description: 'vfio-user server support')
55
index XXXXXXX..XXXXXXX 100644
67
option('dbus_display', type: 'feature', value: 'auto',
56
--- a/contrib/libvhost-user/meson.build
68
description: '-display dbus support')
57
+++ b/contrib/libvhost-user/meson.build
69
option('tpm', type : 'feature', value : 'auto',
58
@@ -XXX,XX +XXX,XX @@
70
diff --git a/configure b/configure
59
libvhost_user = static_library('vhost-user',
71
index XXXXXXX..XXXXXXX 100755
60
files('libvhost-user.c', 'libvhost-user-glib.c'),
72
--- a/configure
61
build_by_default: false)
73
+++ b/configure
62
+vhost_user = declare_dependency(link_with: libvhost_user)
74
@@ -XXX,XX +XXX,XX @@ meson_args=""
75
ninja=""
76
bindir="bin"
77
skip_meson=no
78
+vfio_user_server="disabled"
79
80
# The following Meson options are handled manually (still they
81
# are included in the automatically generated help message)
82
@@ -XXX,XX +XXX,XX @@ for opt do
83
;;
84
--disable-blobs) meson_option_parse --disable-install-blobs ""
85
;;
86
+ --enable-vfio-user-server) vfio_user_server="enabled"
87
+ ;;
88
+ --disable-vfio-user-server) vfio_user_server="disabled"
89
+ ;;
90
--enable-tcmalloc) meson_option_parse --enable-malloc=tcmalloc tcmalloc
91
;;
92
--enable-jemalloc) meson_option_parse --enable-malloc=jemalloc jemalloc
93
@@ -XXX,XX +XXX,XX @@ write_container_target_makefile() {
94
95
96
97
+##########################################
98
+# check for vfio_user_server
99
+
100
+case "$vfio_user_server" in
101
+ enabled )
102
+ if test "$git_submodules_action" != "ignore"; then
103
+ git_submodules="${git_submodules} subprojects/libvfio-user"
104
+ fi
105
+ ;;
106
+esac
107
+
108
##########################################
109
# End of CC checks
110
# After here, no more $cc or $ld runs
111
@@ -XXX,XX +XXX,XX @@ if test "$skip_meson" = no; then
112
test "$slirp" != auto && meson_option_add "-Dslirp=$slirp"
113
test "$smbd" != '' && meson_option_add "-Dsmbd=$smbd"
114
test "$tcg" != enabled && meson_option_add "-Dtcg=$tcg"
115
+ test "$vfio_user_server" != auto && meson_option_add "-Dvfio_user_server=$vfio_user_server"
116
run_meson() {
117
NINJA=$ninja $meson setup --prefix "$prefix" "$@" $cross_arg "$PWD" "$source_path"
118
}
119
diff --git a/meson.build b/meson.build
63
diff --git a/meson.build b/meson.build
120
index XXXXXXX..XXXXXXX 100644
64
index XXXXXXX..XXXXXXX 100644
121
--- a/meson.build
65
--- a/meson.build
122
+++ b/meson.build
66
+++ b/meson.build
123
@@ -XXX,XX +XXX,XX @@ multiprocess_allowed = get_option('multiprocess') \
67
@@ -XXX,XX +XXX,XX @@ trace_events_subdirs += [
124
.require(targetos == 'linux', error_message: 'Multiprocess QEMU is supported only on Linux') \
68
'util',
125
.allowed()
69
]
126
70
127
+vfio_user_server_allowed = get_option('vfio_user_server') \
71
+vhost_user = not_found
128
+ .require(targetos == 'linux', error_message: 'vfio-user server is supported only on Linux') \
72
+if 'CONFIG_VHOST_USER' in config_host
129
+ .allowed()
73
+ subdir('contrib/libvhost-user')
130
+
131
have_tpm = get_option('tpm') \
132
.require(targetos != 'windows', error_message: 'TPM emulation only available on POSIX systems') \
133
.allowed()
134
@@ -XXX,XX +XXX,XX @@ host_kconfig = \
135
(have_virtfs ? ['CONFIG_VIRTFS=y'] : []) + \
136
('CONFIG_LINUX' in config_host ? ['CONFIG_LINUX=y'] : []) + \
137
(have_pvrdma ? ['CONFIG_PVRDMA=y'] : []) + \
138
- (multiprocess_allowed ? ['CONFIG_MULTIPROCESS_ALLOWED=y'] : [])
139
+ (multiprocess_allowed ? ['CONFIG_MULTIPROCESS_ALLOWED=y'] : []) + \
140
+ (vfio_user_server_allowed ? ['CONFIG_VFIO_USER_SERVER_ALLOWED=y'] : [])
141
142
ignored = [ 'TARGET_XML_FILES', 'TARGET_ABI_DIR', 'TARGET_ARCH' ]
143
144
@@ -XXX,XX +XXX,XX @@ if have_system
145
endif
146
endif
147
148
+libvfio_user_dep = not_found
149
+if have_system and vfio_user_server_allowed
150
+ have_internal = fs.exists(meson.current_source_dir() / 'subprojects/libvfio-user/meson.build')
151
+
152
+ if not have_internal
153
+ error('libvfio-user source not found - please pull git submodule')
154
+ endif
155
+
156
+ libvfio_user_proj = subproject('libvfio-user')
157
+
158
+ libvfio_user_lib = libvfio_user_proj.get_variable('libvfio_user_dep')
159
+
160
+ libvfio_user_dep = declare_dependency(dependencies: [libvfio_user_lib])
161
+endif
74
+endif
162
+
75
+
163
fdt = not_found
76
subdir('qapi')
164
if have_system
77
subdir('qobject')
165
fdt_opt = get_option('fdt')
78
subdir('stubs')
166
@@ -XXX,XX +XXX,XX @@ summary_info += {'target list': ' '.join(target_dirs)}
79
@@ -XXX,XX +XXX,XX @@ if have_tools
167
if have_system
80
install: true)
168
summary_info += {'default devices': get_option('default_devices')}
81
169
summary_info += {'out of process emulation': multiprocess_allowed}
82
if 'CONFIG_VHOST_USER' in config_host
170
+ summary_info += {'vfio-user server': vfio_user_server_allowed}
83
- subdir('contrib/libvhost-user')
171
endif
84
subdir('contrib/vhost-user-blk')
172
summary(summary_info, bool_yn: true, section: 'Targets and accelerators')
85
subdir('contrib/vhost-user-gpu')
173
86
subdir('contrib/vhost-user-input')
174
diff --git a/.gitlab-ci.d/buildtest.yml b/.gitlab-ci.d/buildtest.yml
87
diff --git a/util/meson.build b/util/meson.build
175
index XXXXXXX..XXXXXXX 100644
88
index XXXXXXX..XXXXXXX 100644
176
--- a/.gitlab-ci.d/buildtest.yml
89
--- a/util/meson.build
177
+++ b/.gitlab-ci.d/buildtest.yml
90
+++ b/util/meson.build
178
@@ -XXX,XX +XXX,XX @@ build-system-centos:
91
@@ -XXX,XX +XXX,XX @@ if have_block
179
IMAGE: centos8
92
util_ss.add(files('main-loop.c'))
180
CONFIGURE_ARGS: --disable-nettle --enable-gcrypt --enable-fdt=system
93
util_ss.add(files('nvdimm-utils.c'))
181
--enable-modules --enable-trace-backends=dtrace --enable-docs
94
util_ss.add(files('qemu-coroutine.c', 'qemu-coroutine-lock.c', 'qemu-coroutine-io.c'))
182
+ --enable-vfio-user-server
95
- util_ss.add(when: 'CONFIG_LINUX', if_true: files('vhost-user-server.c'))
183
TARGETS: ppc64-softmmu or1k-softmmu s390x-softmmu
96
+ util_ss.add(when: ['CONFIG_LINUX', 'CONFIG_VHOST_USER'], if_true: [
184
x86_64-softmmu rx-softmmu sh4-softmmu nios2-softmmu
97
+ files('vhost-user-server.c'), vhost_user
185
MAKE_CHECK_ARGS: check-build
98
+ ])
186
diff --git a/.gitmodules b/.gitmodules
99
util_ss.add(files('block-helpers.c'))
187
index XXXXXXX..XXXXXXX 100644
100
util_ss.add(files('qemu-coroutine-sleep.c'))
188
--- a/.gitmodules
101
util_ss.add(files('qemu-co-shared-resource.c'))
189
+++ b/.gitmodules
190
@@ -XXX,XX +XXX,XX @@
191
[submodule "tests/lcitool/libvirt-ci"]
192
    path = tests/lcitool/libvirt-ci
193
    url = https://gitlab.com/libvirt/libvirt-ci.git
194
+[submodule "subprojects/libvfio-user"]
195
+    path = subprojects/libvfio-user
196
+    url = https://gitlab.com/qemu-project/libvfio-user.git
197
diff --git a/Kconfig.host b/Kconfig.host
198
index XXXXXXX..XXXXXXX 100644
199
--- a/Kconfig.host
200
+++ b/Kconfig.host
201
@@ -XXX,XX +XXX,XX @@ config MULTIPROCESS_ALLOWED
202
config FUZZ
203
bool
204
select SPARSE_MEM
205
+
206
+config VFIO_USER_SERVER_ALLOWED
207
+ bool
208
+ imply VFIO_USER_SERVER
209
diff --git a/hw/remote/Kconfig b/hw/remote/Kconfig
210
index XXXXXXX..XXXXXXX 100644
211
--- a/hw/remote/Kconfig
212
+++ b/hw/remote/Kconfig
213
@@ -XXX,XX +XXX,XX @@ config MULTIPROCESS
214
bool
215
depends on PCI && PCI_EXPRESS && KVM
216
select REMOTE_PCIHOST
217
+
218
+config VFIO_USER_SERVER
219
+ bool
220
+ depends on MULTIPROCESS
221
diff --git a/hw/remote/meson.build b/hw/remote/meson.build
222
index XXXXXXX..XXXXXXX 100644
223
--- a/hw/remote/meson.build
224
+++ b/hw/remote/meson.build
225
@@ -XXX,XX +XXX,XX @@ remote_ss.add(when: 'CONFIG_MULTIPROCESS', if_true: files('remote-obj.c'))
226
remote_ss.add(when: 'CONFIG_MULTIPROCESS', if_true: files('proxy.c'))
227
remote_ss.add(when: 'CONFIG_MULTIPROCESS', if_true: files('iohub.c'))
228
229
+remote_ss.add(when: 'CONFIG_VFIO_USER_SERVER', if_true: libvfio_user_dep)
230
+
231
specific_ss.add(when: 'CONFIG_MULTIPROCESS', if_true: files('memory.c'))
232
specific_ss.add(when: 'CONFIG_MULTIPROCESS', if_true: files('proxy-memory-listener.c'))
233
234
diff --git a/scripts/meson-buildoptions.sh b/scripts/meson-buildoptions.sh
235
index XXXXXXX..XXXXXXX 100644
236
--- a/scripts/meson-buildoptions.sh
237
+++ b/scripts/meson-buildoptions.sh
238
@@ -XXX,XX +XXX,XX @@ meson_options_help() {
239
printf "%s\n" ' usb-redir libusbredir support'
240
printf "%s\n" ' vde vde network backend support'
241
printf "%s\n" ' vdi vdi image format support'
242
+ printf "%s\n" ' vfio-user-server'
243
+ printf "%s\n" ' vfio-user server support'
244
printf "%s\n" ' vhost-crypto vhost-user crypto backend support'
245
printf "%s\n" ' vhost-kernel vhost kernel backend support'
246
printf "%s\n" ' vhost-net vhost-net kernel acceleration support'
247
@@ -XXX,XX +XXX,XX @@ _meson_option_parse() {
248
--disable-vde) printf "%s" -Dvde=disabled ;;
249
--enable-vdi) printf "%s" -Dvdi=enabled ;;
250
--disable-vdi) printf "%s" -Dvdi=disabled ;;
251
+ --enable-vfio-user-server) printf "%s" -Dvfio_user_server=enabled ;;
252
+ --disable-vfio-user-server) printf "%s" -Dvfio_user_server=disabled ;;
253
--enable-vhost-crypto) printf "%s" -Dvhost_crypto=enabled ;;
254
--disable-vhost-crypto) printf "%s" -Dvhost_crypto=disabled ;;
255
--enable-vhost-kernel) printf "%s" -Dvhost_kernel=enabled ;;
256
diff --git a/subprojects/libvfio-user b/subprojects/libvfio-user
257
new file mode 160000
258
index XXXXXXX..XXXXXXX
259
--- /dev/null
260
+++ b/subprojects/libvfio-user
261
@@ -0,0 +1 @@
262
+Subproject commit 0b28d205572c80b568a1003db2c8f37ca333e4d7
263
diff --git a/tests/docker/dockerfiles/centos8.docker b/tests/docker/dockerfiles/centos8.docker
264
index XXXXXXX..XXXXXXX 100644
265
--- a/tests/docker/dockerfiles/centos8.docker
266
+++ b/tests/docker/dockerfiles/centos8.docker
267
@@ -XXX,XX +XXX,XX @@ RUN dnf update -y && \
268
libbpf-devel \
269
libcacard-devel \
270
libcap-ng-devel \
271
+ libcmocka-devel \
272
libcurl-devel \
273
libdrm-devel \
274
libepoxy-devel \
275
@@ -XXX,XX +XXX,XX @@ RUN dnf update -y && \
276
libgcrypt-devel \
277
libiscsi-devel \
278
libjpeg-devel \
279
+ json-c-devel \
280
libnfs-devel \
281
libpmem-devel \
282
libpng-devel \
283
--
102
--
284
2.36.1
103
2.26.2
104
1
From: Sam Li <faithilikerun@gmail.com>
1
Introduce libblockdev.fa to avoid recompiling blockdev_ss twice.
2
2
3
Linux recently added a new io_uring(7) optimization API that QEMU
3
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
4
doesn't take advantage of yet. The liburing library that QEMU uses
4
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
5
has added a corresponding new API called io_uring_register_ring_fd().
5
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
6
When this API is called after creating the ring, the io_uring_submit()
6
Message-id: 20200929125516.186715-3-stefanha@redhat.com
7
library function passes a flag to the io_uring_enter(2) syscall
8
allowing it to skip the ring file descriptor fdget()/fdput()
9
operations. This saves some CPU cycles.
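
A minimal sketch of the idea, assuming the build defines CONFIG_LIBURING_REGISTER_RING_FD only when liburing provides io_uring_register_ring_fd(). The `SketchRing` type and `sketch_register_ring_fd()` helper are hypothetical names for illustration. Without the macro the sketch compiles against a placeholder type and simply reports that the optimization is unavailable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#ifdef CONFIG_LIBURING_REGISTER_RING_FD
#include <liburing.h>
typedef struct io_uring SketchRing;
#else
/* Placeholder so the sketch builds on hosts without a new enough liburing. */
typedef struct SketchRing {
    int unused;
} SketchRing;
#endif

/*
 * Call once after the ring has been created. Returns true when the ring
 * fd was registered; false means submissions keep using the regular,
 * non-optimized io_uring_enter(2) path, which is still correct.
 */
static bool sketch_register_ring_fd(SketchRing *ring)
{
    assert(ring != NULL);
#ifdef CONFIG_LIBURING_REGISTER_RING_FD
    /* liburing reports failure as a negative errno value. */
    return io_uring_register_ring_fd(ring) >= 0;
#else
    return false;
#endif
}
```

Because registration is purely an optimization, a failure is best treated as a warning rather than an error.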
10
11
Signed-off-by: Sam Li <faithilikerun@gmail.com>
12
Message-id: 20220531105011.111082-1-faithilikerun@gmail.com
13
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
7
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
14
---
8
---
15
meson.build | 1 +
9
meson.build | 12 ++++++++++--
16
block/io_uring.c | 12 +++++++++++-
10
storage-daemon/meson.build | 3 +--
17
2 files changed, 12 insertions(+), 1 deletion(-)
11
2 files changed, 11 insertions(+), 4 deletions(-)
18
12
19
diff --git a/meson.build b/meson.build
13
diff --git a/meson.build b/meson.build
20
index XXXXXXX..XXXXXXX 100644
14
index XXXXXXX..XXXXXXX 100644
21
--- a/meson.build
15
--- a/meson.build
22
+++ b/meson.build
16
+++ b/meson.build
23
@@ -XXX,XX +XXX,XX @@ config_host_data.set('CONFIG_LIBNFS', libnfs.found())
17
@@ -XXX,XX +XXX,XX @@ blockdev_ss.add(files(
24
config_host_data.set('CONFIG_LIBSSH', libssh.found())
18
# os-win32.c does not
25
config_host_data.set('CONFIG_LINUX_AIO', libaio.found())
19
blockdev_ss.add(when: 'CONFIG_POSIX', if_true: files('os-posix.c'))
26
config_host_data.set('CONFIG_LINUX_IO_URING', linux_io_uring.found())
20
softmmu_ss.add(when: 'CONFIG_WIN32', if_true: [files('os-win32.c')])
27
+config_host_data.set('CONFIG_LIBURING_REGISTER_RING_FD', cc.has_function('io_uring_register_ring_fd', prefix: '#include <liburing.h>', dependencies:linux_io_uring))
21
-softmmu_ss.add_all(blockdev_ss)
28
config_host_data.set('CONFIG_LIBPMEM', libpmem.found())
22
29
config_host_data.set('CONFIG_NUMA', numa.found())
23
common_ss.add(files('cpus-common.c'))
30
config_host_data.set('CONFIG_OPENGL', opengl.found())
24
31
diff --git a/block/io_uring.c b/block/io_uring.c
25
@@ -XXX,XX +XXX,XX @@ block = declare_dependency(link_whole: [libblock],
26
link_args: '@block.syms',
27
dependencies: [crypto, io])
28
29
+blockdev_ss = blockdev_ss.apply(config_host, strict: false)
30
+libblockdev = static_library('blockdev', blockdev_ss.sources() + genh,
31
+ dependencies: blockdev_ss.dependencies(),
32
+ name_suffix: 'fa',
33
+ build_by_default: false)
34
+
35
+blockdev = declare_dependency(link_whole: [libblockdev],
36
+ dependencies: [block])
37
+
38
qmp_ss = qmp_ss.apply(config_host, strict: false)
39
libqmp = static_library('qmp', qmp_ss.sources() + genh,
40
dependencies: qmp_ss.dependencies(),
41
@@ -XXX,XX +XXX,XX @@ foreach m : block_mods + softmmu_mods
42
install_dir: config_host['qemu_moddir'])
43
endforeach
44
45
-softmmu_ss.add(authz, block, chardev, crypto, io, qmp)
46
+softmmu_ss.add(authz, blockdev, chardev, crypto, io, qmp)
47
common_ss.add(qom, qemuutil)
48
49
common_ss.add_all(when: 'CONFIG_SOFTMMU', if_true: [softmmu_ss])
50
diff --git a/storage-daemon/meson.build b/storage-daemon/meson.build
32
index XXXXXXX..XXXXXXX 100644
51
index XXXXXXX..XXXXXXX 100644
33
--- a/block/io_uring.c
52
--- a/storage-daemon/meson.build
34
+++ b/block/io_uring.c
53
+++ b/storage-daemon/meson.build
35
@@ -XXX,XX +XXX,XX @@
54
@@ -XXX,XX +XXX,XX @@
36
#include "qapi/error.h"
55
qsd_ss = ss.source_set()
37
#include "trace.h"
56
qsd_ss.add(files('qemu-storage-daemon.c'))
38
57
-qsd_ss.add(block, chardev, qmp, qom, qemuutil)
39
+
58
-qsd_ss.add_all(blockdev_ss)
40
/* io_uring ring size */
59
+qsd_ss.add(blockdev, chardev, qmp, qom, qemuutil)
41
#define MAX_ENTRIES 128
60
42
61
subdir('qapi')
43
@@ -XXX,XX +XXX,XX @@ LuringState *luring_init(Error **errp)
62
44
}
45
46
ioq_init(&s->io_q);
47
+#ifdef CONFIG_LIBURING_REGISTER_RING_FD
48
+ if (io_uring_register_ring_fd(&s->ring) < 0) {
49
+ /*
50
+ * Only warn about this error: we will fallback to the non-optimized
51
+ * io_uring operations.
52
+ */
53
+ warn_report("failed to register linux io_uring ring file descriptor");
54
+ }
55
+#endif
56
+
57
return s;
58
-
59
}
60
61
void luring_cleanup(LuringState *s)
62
--
63
--
63
2.36.1
64
2.26.2
65
1
From: Jagannathan Raman <jag.raman@oracle.com>
1
Block exports are used by softmmu, qemu-storage-daemon, and qemu-nbd.
2
They are not used by other programs and are not otherwise needed in
3
libblock.
2
4
3
Forward remote device's interrupts to the guest
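As a rough illustration of how the INTx path in this patch identifies the device, the bus IRQ number ("pirq") is the bus number and devfn packed into one integer, which the handler later unpacks. The sketch below mirrors the bit layout of PCI_BUILD_BDF, PCI_BDF_TO_DEVFN, PCI_SLOT and PCI_FUNC from the hunks in this patch. The helper names are hypothetical, and the bus extraction assumes PCI_BUS_NUM() is the usual "shift right by 8, mask with 0xff".

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers; only the bit layout is taken from the patch. */

/* Pack bus number and devfn the way PCI_BUILD_BDF() does. */
static inline int demo_build_bdf(uint8_t bus, uint8_t devfn)
{
    return (bus << 8) | devfn;
}

/* Assumed equivalent of PCI_BUS_NUM(): bits 15..8 hold the bus number. */
static inline int demo_bdf_to_bus(int bdf)
{
    return (bdf >> 8) & 0xff;
}

/* Same mask as PCI_BDF_TO_DEVFN(): bits 7..0 hold devfn. */
static inline int demo_bdf_to_devfn(int bdf)
{
    return bdf & 0xff;
}

/* Same layout as PCI_SLOT() and PCI_FUNC(): 5 slot bits, 3 function bits. */
static inline int demo_devfn_slot(int devfn)
{
    return (devfn >> 3) & 0x1f;
}

static inline int demo_devfn_func(int devfn)
{
    return devfn & 0x07;
}
```

With this layout a device at bus 1, slot 5, function 0 has devfn 0x28 and BDF 0x128, so the largest value a single bus can produce is its bus number in the high byte with 0xff in the low byte.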
5
Undo the recent move of blockdev-nbd.c from blockdev_ss into block_ss.
6
Since bdrv_close_all() (libblock) calls blk_exp_close_all()
7
(libblockdev), a stub function is required.
4
8
5
Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
9
Make qemu-nbd.c use signal handling utility functions instead of
6
Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
10
duplicating the code. This helps because os-posix.c is in libblockdev
7
Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
11
and it depends on a qemu_system_killed() symbol that qemu-nbd.c lacks.
8
Message-id: 9523479eaafe050677f4de2af5dd0df18c27cfd9.1655151679.git.jag.raman@oracle.com
12
Once we use the signal handling utility functions we also end up
13
providing the necessary symbol.
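The termination pattern described above can be sketched outside QEMU roughly as follows. This is a minimal illustration, not the qemu-nbd code: the state names, demo_on_term() and demo_install_handlers() are hypothetical, and C11 atomics stand in for qatomic_cmpxchg(). The handler only flips RUNNING to TERMINATE, so repeated signals are harmless.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdatomic.h>
#include <string.h>

/* Hypothetical server states; only RUNNING -> TERMINATE is shown. */
enum { DEMO_RUNNING, DEMO_TERMINATE };

static atomic_int demo_state = DEMO_RUNNING;

/* Signal handler: request shutdown once, ignore later signals. */
static void demo_on_term(int signum)
{
    int expected = DEMO_RUNNING;

    (void)signum;
    atomic_compare_exchange_strong(&demo_state, &expected, DEMO_TERMINATE);
}

/* Install the same handler for SIGTERM, SIGINT and SIGHUP; ignore SIGPIPE. */
static int demo_install_handlers(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = demo_on_term;

    if (sigaction(SIGTERM, &sa, NULL) < 0 ||
        sigaction(SIGINT, &sa, NULL) < 0 ||
        sigaction(SIGHUP, &sa, NULL) < 0) {
        return -1;
    }

    sa.sa_handler = SIG_IGN;
    return sigaction(SIGPIPE, &sa, NULL);
}
```

A main loop would then poll demo_state and start an orderly shutdown once it reads DEMO_TERMINATE.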
14
15
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
16
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
17
Reviewed-by: Eric Blake <eblake@redhat.com>
18
Message-id: 20200929125516.186715-4-stefanha@redhat.com
19
[Fixed s/ndb/nbd/ typo in commit description as suggested by Eric Blake
20
--Stefan]
9
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
21
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10
---
22
---
11
MAINTAINERS | 1 +
23
qemu-nbd.c | 21 ++++++++-------------
12
include/hw/pci/msi.h | 1 +
24
stubs/blk-exp-close-all.c | 7 +++++++
13
include/hw/pci/msix.h | 1 +
25
block/export/meson.build | 4 ++--
14
include/hw/pci/pci.h | 13 +++
26
meson.build | 4 ++--
15
include/hw/remote/vfio-user-obj.h | 6 ++
27
nbd/meson.build | 2 ++
16
hw/pci/msi.c | 49 +++++++--
28
stubs/meson.build | 1 +
17
hw/pci/msix.c | 35 ++++++-
29
6 files changed, 22 insertions(+), 17 deletions(-)
18
hw/pci/pci.c | 13 +++
30
create mode 100644 stubs/blk-exp-close-all.c
19
hw/remote/machine.c | 16 ++-
20
hw/remote/vfio-user-obj.c | 167 ++++++++++++++++++++++++++++++
21
stubs/vfio-user-obj.c | 6 ++
22
hw/remote/trace-events | 1 +
23
stubs/meson.build | 1 +
24
13 files changed, 298 insertions(+), 12 deletions(-)
25
create mode 100644 include/hw/remote/vfio-user-obj.h
26
create mode 100644 stubs/vfio-user-obj.c
27
31
28
diff --git a/MAINTAINERS b/MAINTAINERS
32
diff --git a/qemu-nbd.c b/qemu-nbd.c
29
index XXXXXXX..XXXXXXX 100644
33
index XXXXXXX..XXXXXXX 100644
30
--- a/MAINTAINERS
34
--- a/qemu-nbd.c
31
+++ b/MAINTAINERS
35
+++ b/qemu-nbd.c
32
@@ -XXX,XX +XXX,XX @@ F: hw/remote/iohub.c
36
@@ -XXX,XX +XXX,XX @@
33
F: include/hw/remote/iohub.h
37
#include "qapi/error.h"
34
F: subprojects/libvfio-user
38
#include "qemu/cutils.h"
35
F: hw/remote/vfio-user-obj.c
39
#include "sysemu/block-backend.h"
36
+F: include/hw/remote/vfio-user-obj.h
40
+#include "sysemu/runstate.h" /* for qemu_system_killed() prototype */
37
F: hw/remote/iommu.c
41
#include "block/block_int.h"
38
F: include/hw/remote/iommu.h
42
#include "block/nbd.h"
39
43
#include "qemu/main-loop.h"
40
diff --git a/include/hw/pci/msi.h b/include/hw/pci/msi.h
44
@@ -XXX,XX +XXX,XX @@ QEMU_COPYRIGHT "\n"
41
index XXXXXXX..XXXXXXX 100644
45
}
42
--- a/include/hw/pci/msi.h
46
43
+++ b/include/hw/pci/msi.h
47
#ifdef CONFIG_POSIX
44
@@ -XXX,XX +XXX,XX @@ void msi_notify(PCIDevice *dev, unsigned int vector);
48
-static void termsig_handler(int signum)
45
void msi_send_message(PCIDevice *dev, MSIMessage msg);
49
+/*
46
void msi_write_config(PCIDevice *dev, uint32_t addr, uint32_t val, int len);
50
+ * The client thread uses SIGTERM to interrupt the server. A signal
47
unsigned int msi_nr_vectors_allocated(const PCIDevice *dev);
51
+ * handler ensures that "qemu-nbd -v -c" exits with a nice status code.
48
+void msi_set_mask(PCIDevice *dev, int vector, bool mask, Error **errp);
52
+ */
49
53
+void qemu_system_killed(int signum, pid_t pid)
50
static inline bool msi_present(const PCIDevice *dev)
51
{
54
{
52
diff --git a/include/hw/pci/msix.h b/include/hw/pci/msix.h
55
qatomic_cmpxchg(&state, RUNNING, TERMINATE);
53
index XXXXXXX..XXXXXXX 100644
56
qemu_notify_event();
54
--- a/include/hw/pci/msix.h
57
@@ -XXX,XX +XXX,XX @@ int main(int argc, char **argv)
55
+++ b/include/hw/pci/msix.h
58
BlockExportOptions *export_opts;
56
@@ -XXX,XX +XXX,XX @@ void msix_clr_pending(PCIDevice *dev, int vector);
59
57
int msix_vector_use(PCIDevice *dev, unsigned vector);
60
#ifdef CONFIG_POSIX
58
void msix_vector_unuse(PCIDevice *dev, unsigned vector);
61
- /*
59
void msix_unuse_all_vectors(PCIDevice *dev);
62
- * Exit gracefully on various signals, which includes SIGTERM used
60
+void msix_set_mask(PCIDevice *dev, int vector, bool mask, Error **errp);
63
- * by 'qemu-nbd -v -c'.
61
64
- */
62
void msix_notify(PCIDevice *dev, unsigned vector);
65
- struct sigaction sa_sigterm;
63
66
- memset(&sa_sigterm, 0, sizeof(sa_sigterm));
64
diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
67
- sa_sigterm.sa_handler = termsig_handler;
65
index XXXXXXX..XXXXXXX 100644
68
- sigaction(SIGTERM, &sa_sigterm, NULL);
66
--- a/include/hw/pci/pci.h
69
- sigaction(SIGINT, &sa_sigterm, NULL);
67
+++ b/include/hw/pci/pci.h
70
- sigaction(SIGHUP, &sa_sigterm, NULL);
68
@@ -XXX,XX +XXX,XX @@ extern bool pci_available;
71
-
69
#define PCI_SLOT(devfn) (((devfn) >> 3) & 0x1f)
72
- signal(SIGPIPE, SIG_IGN);
70
#define PCI_FUNC(devfn) ((devfn) & 0x07)
73
+ os_setup_early_signal_handling();
71
#define PCI_BUILD_BDF(bus, devfn) ((bus << 8) | (devfn))
74
+ os_setup_signal_handling();
72
+#define PCI_BDF_TO_DEVFN(x) ((x) & 0xff)
75
#endif
73
#define PCI_BUS_MAX 256
76
74
#define PCI_DEVFN_MAX 256
77
socket_init();
75
#define PCI_SLOT_MAX 32
78
diff --git a/stubs/blk-exp-close-all.c b/stubs/blk-exp-close-all.c
76
@@ -XXX,XX +XXX,XX @@ typedef void PCIMapIORegionFunc(PCIDevice *pci_dev, int region_num,
77
pcibus_t addr, pcibus_t size, int type);
78
typedef void PCIUnregisterFunc(PCIDevice *pci_dev);
79
80
+typedef void MSITriggerFunc(PCIDevice *dev, MSIMessage msg);
81
+typedef MSIMessage MSIPrepareMessageFunc(PCIDevice *dev, unsigned vector);
82
+typedef MSIMessage MSIxPrepareMessageFunc(PCIDevice *dev, unsigned vector);
83
+
84
typedef struct PCIIORegion {
85
pcibus_t addr; /* current PCI mapping address. -1 means not mapped */
86
#define PCI_BAR_UNMAPPED (~(pcibus_t)0)
87
@@ -XXX,XX +XXX,XX @@ struct PCIDevice {
88
/* Space to store MSIX table & pending bit array */
89
uint8_t *msix_table;
90
uint8_t *msix_pba;
91
+
92
+ /* May be used by INTx or MSI during interrupt notification */
93
+ void *irq_opaque;
94
+
95
+ MSITriggerFunc *msi_trigger;
96
+ MSIPrepareMessageFunc *msi_prepare_message;
97
+ MSIxPrepareMessageFunc *msix_prepare_message;
98
+
99
/* MemoryRegion container for msix exclusive BAR setup */
100
MemoryRegion msix_exclusive_bar;
101
/* Memory Regions for MSIX table and pending bit entries. */
102
diff --git a/include/hw/remote/vfio-user-obj.h b/include/hw/remote/vfio-user-obj.h
103
new file mode 100644
79
new file mode 100644
104
index XXXXXXX..XXXXXXX
80
index XXXXXXX..XXXXXXX
105
--- /dev/null
81
--- /dev/null
106
+++ b/include/hw/remote/vfio-user-obj.h
82
+++ b/stubs/blk-exp-close-all.c
107
@@ -XXX,XX +XXX,XX @@
+#ifndef VFIO_USER_OBJ_H
+#define VFIO_USER_OBJ_H
+
+void vfu_object_set_bus_irq(PCIBus *pci_bus);
+
+#endif
114
diff --git a/hw/pci/msi.c b/hw/pci/msi.c
115
index XXXXXXX..XXXXXXX 100644
116
--- a/hw/pci/msi.c
117
+++ b/hw/pci/msi.c
118
@@ -XXX,XX +XXX,XX @@ void msi_set_message(PCIDevice *dev, MSIMessage msg)
119
pci_set_word(dev->config + msi_data_off(dev, msi64bit), msg.data);
120
}
121
122
-MSIMessage msi_get_message(PCIDevice *dev, unsigned int vector)
123
+static MSIMessage msi_prepare_message(PCIDevice *dev, unsigned int vector)
124
{
125
uint16_t flags = pci_get_word(dev->config + msi_flags_off(dev));
126
bool msi64bit = flags & PCI_MSI_FLAGS_64BIT;
127
@@ -XXX,XX +XXX,XX @@ MSIMessage msi_get_message(PCIDevice *dev, unsigned int vector)
128
return msg;
129
}
130
131
+MSIMessage msi_get_message(PCIDevice *dev, unsigned int vector)
132
+{
133
+ return dev->msi_prepare_message(dev, vector);
134
+}
135
+
136
bool msi_enabled(const PCIDevice *dev)
137
{
138
return msi_present(dev) &&
139
@@ -XXX,XX +XXX,XX @@ int msi_init(struct PCIDevice *dev, uint8_t offset,
140
0xffffffff >> (PCI_MSI_VECTORS_MAX - nr_vectors));
141
}
142
143
+ dev->msi_prepare_message = msi_prepare_message;
144
+
145
return 0;
146
}
147
148
@@ -XXX,XX +XXX,XX @@ void msi_uninit(struct PCIDevice *dev)
149
cap_size = msi_cap_sizeof(flags);
150
pci_del_capability(dev, PCI_CAP_ID_MSI, cap_size);
151
dev->cap_present &= ~QEMU_PCI_CAP_MSI;
152
+ dev->msi_prepare_message = NULL;
153
154
MSI_DEV_PRINTF(dev, "uninit\n");
155
}
156
@@ -XXX,XX +XXX,XX @@ bool msi_is_masked(const PCIDevice *dev, unsigned int vector)
157
return mask & (1U << vector);
158
}
159
+void msi_set_mask(PCIDevice *dev, int vector, bool mask, Error **errp)
+{
+    ERRP_GUARD();
+    uint16_t flags = pci_get_word(dev->config + msi_flags_off(dev));
+    bool msi64bit = flags & PCI_MSI_FLAGS_64BIT;
+    uint32_t irq_state, vector_mask, pending;
+
+    if (vector > PCI_MSI_VECTORS_MAX) {
+        error_setg(errp, "msi: vector %d not allocated. max vector is %d",
+                   vector, PCI_MSI_VECTORS_MAX);
+        return;
+    }
+
+    vector_mask = (1U << vector);
+
+    irq_state = pci_get_long(dev->config + msi_mask_off(dev, msi64bit));
+
+    if (mask) {
+        irq_state |= vector_mask;
+    } else {
+        irq_state &= ~vector_mask;
+    }
+
+    pci_set_long(dev->config + msi_mask_off(dev, msi64bit), irq_state);
+
+    pending = pci_get_long(dev->config + msi_pending_off(dev, msi64bit));
+    if (!mask && (pending & vector_mask)) {
+        pending &= ~vector_mask;
+        pci_set_long(dev->config + msi_pending_off(dev, msi64bit), pending);
+        msi_notify(dev, vector);
+    }
+}
+
193
void msi_notify(PCIDevice *dev, unsigned int vector)
194
{
195
uint16_t flags = pci_get_word(dev->config + msi_flags_off(dev));
196
@@ -XXX,XX +XXX,XX @@ void msi_notify(PCIDevice *dev, unsigned int vector)
197
198
void msi_send_message(PCIDevice *dev, MSIMessage msg)
199
{
200
- MemTxAttrs attrs = {};
201
-
202
- attrs.requester_id = pci_requester_id(dev);
203
- address_space_stl_le(&dev->bus_master_as, msg.address, msg.data,
204
- attrs, NULL);
205
+ dev->msi_trigger(dev, msg);
206
}
207
208
/* Normally called by pci_default_write_config(). */
209
diff --git a/hw/pci/msix.c b/hw/pci/msix.c
210
index XXXXXXX..XXXXXXX 100644
211
--- a/hw/pci/msix.c
212
+++ b/hw/pci/msix.c
213
@@ -XXX,XX +XXX,XX @@
214
#define MSIX_ENABLE_MASK (PCI_MSIX_FLAGS_ENABLE >> 8)
215
#define MSIX_MASKALL_MASK (PCI_MSIX_FLAGS_MASKALL >> 8)
216
217
-MSIMessage msix_get_message(PCIDevice *dev, unsigned vector)
218
+static MSIMessage msix_prepare_message(PCIDevice *dev, unsigned vector)
219
{
220
uint8_t *table_entry = dev->msix_table + vector * PCI_MSIX_ENTRY_SIZE;
221
MSIMessage msg;
222
@@ -XXX,XX +XXX,XX @@ MSIMessage msix_get_message(PCIDevice *dev, unsigned vector)
223
return msg;
224
}
225
226
+MSIMessage msix_get_message(PCIDevice *dev, unsigned vector)
227
+{
228
+ return dev->msix_prepare_message(dev, vector);
229
+}
230
+
231
/*
232
* Special API for POWER to configure the vectors through
233
* a side channel. Should never be used by devices.
234
@@ -XXX,XX +XXX,XX @@ static void msix_handle_mask_update(PCIDevice *dev, int vector, bool was_masked)
235
}
236
}
237
+void msix_set_mask(PCIDevice *dev, int vector, bool mask, Error **errp)
+{
+    ERRP_GUARD();
+    unsigned offset;
+    bool was_masked;
+
+    if (vector > dev->msix_entries_nr) {
+        error_setg(errp, "msix: vector %d not allocated. max vector is %d",
+                   vector, dev->msix_entries_nr);
+        return;
+    }
+
+    offset = vector * PCI_MSIX_ENTRY_SIZE + PCI_MSIX_ENTRY_VECTOR_CTRL;
+
+    was_masked = msix_is_masked(dev, vector);
+
+    if (mask) {
+        dev->msix_table[offset] |= PCI_MSIX_ENTRY_CTRL_MASKBIT;
+    } else {
+        dev->msix_table[offset] &= ~PCI_MSIX_ENTRY_CTRL_MASKBIT;
+    }
+
+    msix_handle_mask_update(dev, vector, was_masked);
+}
+
263
static bool msix_masked(PCIDevice *dev)
264
{
265
return dev->config[dev->msix_cap + MSIX_CONTROL_OFFSET] & MSIX_MASKALL_MASK;
266
@@ -XXX,XX +XXX,XX @@ int msix_init(struct PCIDevice *dev, unsigned short nentries,
267
"msix-pba", pba_size);
268
memory_region_add_subregion(pba_bar, pba_offset, &dev->msix_pba_mmio);
269
270
+ dev->msix_prepare_message = msix_prepare_message;
271
+
272
return 0;
273
}
274
275
@@ -XXX,XX +XXX,XX @@ void msix_uninit(PCIDevice *dev, MemoryRegion *table_bar, MemoryRegion *pba_bar)
276
g_free(dev->msix_entry_used);
277
dev->msix_entry_used = NULL;
278
dev->cap_present &= ~QEMU_PCI_CAP_MSIX;
279
+ dev->msix_prepare_message = NULL;
280
}
281
282
void msix_uninit_exclusive_bar(PCIDevice *dev)
283
diff --git a/hw/pci/pci.c b/hw/pci/pci.c
284
index XXXXXXX..XXXXXXX 100644
285
--- a/hw/pci/pci.c
286
+++ b/hw/pci/pci.c
287
@@ -XXX,XX +XXX,XX @@ void pci_device_deassert_intx(PCIDevice *dev)
288
}
289
}
290
+static void pci_msi_trigger(PCIDevice *dev, MSIMessage msg)
+{
+    MemTxAttrs attrs = {};
+
+    attrs.requester_id = pci_requester_id(dev);
+    address_space_stl_le(&dev->bus_master_as, msg.address, msg.data,
+                         attrs, NULL);
+}
+
300
static void pci_reset_regions(PCIDevice *dev)
301
{
302
int r;
303
@@ -XXX,XX +XXX,XX @@ static void pci_qdev_unrealize(DeviceState *dev)
304
305
pci_device_deassert_intx(pci_dev);
306
do_pci_unregister_device(pci_dev);
307
+
308
+ pci_dev->msi_trigger = NULL;
309
}
310
311
void pci_register_bar(PCIDevice *pci_dev, int region_num,
312
@@ -XXX,XX +XXX,XX @@ static void pci_qdev_realize(DeviceState *qdev, Error **errp)
313
}
314
315
pci_set_power(pci_dev, true);
316
+
317
+ pci_dev->msi_trigger = pci_msi_trigger;
318
}
319
320
PCIDevice *pci_new_multifunction(int devfn, bool multifunction,
321
diff --git a/hw/remote/machine.c b/hw/remote/machine.c
322
index XXXXXXX..XXXXXXX 100644
323
--- a/hw/remote/machine.c
324
+++ b/hw/remote/machine.c
325
@@ -XXX,XX +XXX,XX @@
326
#include "hw/remote/iommu.h"
327
#include "hw/qdev-core.h"
328
#include "hw/remote/iommu.h"
329
+#include "hw/remote/vfio-user-obj.h"
330
+#include "hw/pci/msi.h"
331
332
static void remote_machine_init(MachineState *machine)
333
{
334
@@ -XXX,XX +XXX,XX @@ static void remote_machine_init(MachineState *machine)
335
336
if (s->vfio_user) {
337
remote_iommu_setup(pci_host->bus);
338
+
339
+ msi_nonbroken = true;
340
+
341
+ vfu_object_set_bus_irq(pci_host->bus);
342
+ } else {
343
+ remote_iohub_init(&s->iohub);
344
+
345
+ pci_bus_irqs(pci_host->bus, remote_iohub_set_irq, remote_iohub_map_irq,
346
+ &s->iohub, REMOTE_IOHUB_NB_PIRQS);
347
}
348
349
- remote_iohub_init(&s->iohub);
350
-
351
- pci_bus_irqs(pci_host->bus, remote_iohub_set_irq, remote_iohub_map_irq,
352
- &s->iohub, REMOTE_IOHUB_NB_PIRQS);
353
-
354
qbus_set_hotplug_handler(BUS(pci_host->bus), OBJECT(s));
355
}
356
357
diff --git a/hw/remote/vfio-user-obj.c b/hw/remote/vfio-user-obj.c
358
index XXXXXXX..XXXXXXX 100644
359
--- a/hw/remote/vfio-user-obj.c
360
+++ b/hw/remote/vfio-user-obj.c
361
@@ -XXX,XX +XXX,XX @@
362
#include "hw/pci/pci.h"
363
#include "qemu/timer.h"
364
#include "exec/memory.h"
365
+#include "hw/pci/msi.h"
366
+#include "hw/pci/msix.h"
367
+#include "hw/remote/vfio-user-obj.h"
368
369
#define TYPE_VFU_OBJECT "x-vfio-user-server"
370
OBJECT_DECLARE_TYPE(VfuObject, VfuObjectClass, VFU_OBJECT)
371
@@ -XXX,XX +XXX,XX @@ struct VfuObject {
372
Error *unplug_blocker;
373
374
int vfu_poll_fd;
375
+
376
+ MSITriggerFunc *default_msi_trigger;
377
+ MSIPrepareMessageFunc *default_msi_prepare_message;
378
+ MSIxPrepareMessageFunc *default_msix_prepare_message;
379
};
380
381
static void vfu_object_init_ctx(VfuObject *o, Error **errp);
382
@@ -XXX,XX +XXX,XX @@ static void vfu_object_register_bars(vfu_ctx_t *vfu_ctx, PCIDevice *pdev)
383
}
384
}
385
+static int vfu_object_map_irq(PCIDevice *pci_dev, int intx)
+{
+    int pci_bdf = PCI_BUILD_BDF(pci_bus_num(pci_get_bus(pci_dev)),
+                                pci_dev->devfn);
+
+    return pci_bdf;
+}
+
+static void vfu_object_set_irq(void *opaque, int pirq, int level)
+{
+    PCIBus *pci_bus = opaque;
+    PCIDevice *pci_dev = NULL;
+    vfu_ctx_t *vfu_ctx = NULL;
+    int pci_bus_num, devfn;
+
+    if (level) {
+        pci_bus_num = PCI_BUS_NUM(pirq);
+        devfn = PCI_BDF_TO_DEVFN(pirq);
+
+        /*
+         * pci_find_device() performs at O(1) if the device is attached
+         * to the root PCI bus. Whereas, if the device is attached to a
+         * secondary PCI bus (such as when a root port is involved),
+         * finding the parent PCI bus could take O(n)
+         */
+        pci_dev = pci_find_device(pci_bus, pci_bus_num, devfn);
+
+        vfu_ctx = pci_dev->irq_opaque;
+
+        g_assert(vfu_ctx);
+
+        vfu_irq_trigger(vfu_ctx, 0);
+    }
+}
+
421
+static MSIMessage vfu_object_msi_prepare_msg(PCIDevice *pci_dev,
422
+ unsigned int vector)
423
+{
424
+ MSIMessage msg;
425
+
426
+ msg.address = 0;
427
+ msg.data = vector;
428
+
429
+ return msg;
430
+}
431
+
432
+static void vfu_object_msi_trigger(PCIDevice *pci_dev, MSIMessage msg)
433
+{
434
+ vfu_ctx_t *vfu_ctx = pci_dev->irq_opaque;
435
+
436
+ vfu_irq_trigger(vfu_ctx, msg.data);
437
+}
438
+
439
+static void vfu_object_setup_msi_cbs(VfuObject *o)
440
+{
441
+ o->default_msi_trigger = o->pci_dev->msi_trigger;
442
+ o->default_msi_prepare_message = o->pci_dev->msi_prepare_message;
443
+ o->default_msix_prepare_message = o->pci_dev->msix_prepare_message;
444
+
445
+ o->pci_dev->msi_trigger = vfu_object_msi_trigger;
446
+ o->pci_dev->msi_prepare_message = vfu_object_msi_prepare_msg;
447
+ o->pci_dev->msix_prepare_message = vfu_object_msi_prepare_msg;
448
+}
449
+
450
+static void vfu_object_restore_msi_cbs(VfuObject *o)
451
+{
452
+ o->pci_dev->msi_trigger = o->default_msi_trigger;
453
+ o->pci_dev->msi_prepare_message = o->default_msi_prepare_message;
454
+ o->pci_dev->msix_prepare_message = o->default_msix_prepare_message;
455
+}
456
+
457
+static void vfu_msix_irq_state(vfu_ctx_t *vfu_ctx, uint32_t start,
458
+ uint32_t count, bool mask)
459
+{
460
+ VfuObject *o = vfu_get_private(vfu_ctx);
461
+ Error *err = NULL;
462
+ uint32_t vector;
463
+
464
+ for (vector = start; vector < count; vector++) {
465
+ msix_set_mask(o->pci_dev, vector, mask, &err);
466
+ if (err) {
467
+ VFU_OBJECT_ERROR(o, "vfu: %s: %s", o->device,
468
+ error_get_pretty(err));
469
+ error_free(err);
470
+ err = NULL;
471
+ }
472
+ }
473
+}
474
+
475
+static void vfu_msi_irq_state(vfu_ctx_t *vfu_ctx, uint32_t start,
476
+ uint32_t count, bool mask)
477
+{
478
+ VfuObject *o = vfu_get_private(vfu_ctx);
479
+ Error *err = NULL;
480
+ uint32_t vector;
481
+
482
+ for (vector = start; vector < count; vector++) {
483
+ msi_set_mask(o->pci_dev, vector, mask, &err);
484
+ if (err) {
485
+ VFU_OBJECT_ERROR(o, "vfu: %s: %s", o->device,
486
+ error_get_pretty(err));
487
+ error_free(err);
488
+ err = NULL;
489
+ }
490
+ }
491
+}
492
+
493
+static int vfu_object_setup_irqs(VfuObject *o, PCIDevice *pci_dev)
494
+{
495
+ vfu_ctx_t *vfu_ctx = o->vfu_ctx;
496
+ int ret;
497
+
498
+ ret = vfu_setup_device_nr_irqs(vfu_ctx, VFU_DEV_INTX_IRQ, 1);
499
+ if (ret < 0) {
500
+ return ret;
501
+ }
502
+
503
+ if (msix_nr_vectors_allocated(pci_dev)) {
504
+ ret = vfu_setup_device_nr_irqs(vfu_ctx, VFU_DEV_MSIX_IRQ,
505
+ msix_nr_vectors_allocated(pci_dev));
506
+ vfu_setup_irq_state_callback(vfu_ctx, VFU_DEV_MSIX_IRQ,
507
+ &vfu_msix_irq_state);
508
+ } else if (msi_nr_vectors_allocated(pci_dev)) {
509
+ ret = vfu_setup_device_nr_irqs(vfu_ctx, VFU_DEV_MSI_IRQ,
510
+ msi_nr_vectors_allocated(pci_dev));
511
+ vfu_setup_irq_state_callback(vfu_ctx, VFU_DEV_MSI_IRQ,
512
+ &vfu_msi_irq_state);
513
+ }
514
+
515
+ if (ret < 0) {
516
+ return ret;
517
+ }
518
+
519
+ vfu_object_setup_msi_cbs(o);
520
+
521
+ pci_dev->irq_opaque = vfu_ctx;
522
+
523
+ return 0;
524
+}
525
+
526
+void vfu_object_set_bus_irq(PCIBus *pci_bus)
527
+{
528
+ int bus_num = pci_bus_num(pci_bus);
529
+ int max_bdf = PCI_BUILD_BDF(bus_num, PCI_DEVFN_MAX - 1);
530
+
531
+ pci_bus_irqs(pci_bus, vfu_object_set_irq, vfu_object_map_irq, pci_bus,
532
+ max_bdf);
533
+}
534
+
535
/*
536
* TYPE_VFU_OBJECT depends on the availability of the 'socket' and 'device'
537
* properties. It also depends on devices instantiated in QEMU. These
538
@@ -XXX,XX +XXX,XX @@ static void vfu_object_init_ctx(VfuObject *o, Error **errp)
539
540
vfu_object_register_bars(o->vfu_ctx, o->pci_dev);
541
542
+ ret = vfu_object_setup_irqs(o, o->pci_dev);
543
+ if (ret < 0) {
544
+ error_setg(errp, "vfu: Failed to setup interrupts for %s",
545
+ o->device);
546
+ goto fail;
547
+ }
548
+
549
ret = vfu_realize_ctx(o->vfu_ctx);
550
if (ret < 0) {
551
error_setg(errp, "vfu: Failed to realize device %s- %s",
552
@@ -XXX,XX +XXX,XX @@ fail:
553
o->unplug_blocker = NULL;
554
}
555
if (o->pci_dev) {
556
+ vfu_object_restore_msi_cbs(o);
557
+ o->pci_dev->irq_opaque = NULL;
558
object_unref(OBJECT(o->pci_dev));
559
o->pci_dev = NULL;
560
}
561
@@ -XXX,XX +XXX,XX @@ static void vfu_object_finalize(Object *obj)
562
}
563
564
if (o->pci_dev) {
565
+ vfu_object_restore_msi_cbs(o);
566
+ o->pci_dev->irq_opaque = NULL;
567
object_unref(OBJECT(o->pci_dev));
568
o->pci_dev = NULL;
569
}
570
diff --git a/stubs/vfio-user-obj.c b/stubs/vfio-user-obj.c
571
new file mode 100644
572
index XXXXXXX..XXXXXXX
573
--- /dev/null
574
+++ b/stubs/vfio-user-obj.c
575
@@ -XXX,XX +XXX,XX @@
83
@@ -XXX,XX +XXX,XX @@
576
+#include "qemu/osdep.h"
84
+#include "qemu/osdep.h"
577
+#include "hw/remote/vfio-user-obj.h"
85
+#include "block/export.h"
578
+
86
+
579
+void vfu_object_set_bus_irq(PCIBus *pci_bus)
87
+/* Only used in programs that support block exports (libblockdev.fa) */
88
+void blk_exp_close_all(void)
580
+{
89
+{
581
+}
90
+}
582
diff --git a/hw/remote/trace-events b/hw/remote/trace-events
91
diff --git a/block/export/meson.build b/block/export/meson.build
583
index XXXXXXX..XXXXXXX 100644
92
index XXXXXXX..XXXXXXX 100644
584
--- a/hw/remote/trace-events
93
--- a/block/export/meson.build
585
+++ b/hw/remote/trace-events
94
+++ b/block/export/meson.build
586
@@ -XXX,XX +XXX,XX @@ vfu_dma_unregister(uint64_t gpa) "vfu: unregistering GPA 0x%"PRIx64""
95
@@ -XXX,XX +XXX,XX @@
587
vfu_bar_register(int i, uint64_t addr, uint64_t size) "vfu: BAR %d: addr 0x%"PRIx64" size 0x%"PRIx64""
96
-block_ss.add(files('export.c'))
588
vfu_bar_rw_enter(const char *op, uint64_t addr) "vfu: %s request for BAR address 0x%"PRIx64""
97
-block_ss.add(when: ['CONFIG_LINUX', 'CONFIG_VHOST_USER'], if_true: files('vhost-user-blk-server.c'))
589
vfu_bar_rw_exit(const char *op, uint64_t addr) "vfu: Finished %s of BAR address 0x%"PRIx64""
98
+blockdev_ss.add(files('export.c'))
590
+vfu_interrupt(int pirq) "vfu: sending interrupt to device - PIRQ %d"
99
+blockdev_ss.add(when: ['CONFIG_LINUX', 'CONFIG_VHOST_USER'], if_true: files('vhost-user-blk-server.c'))
100
diff --git a/meson.build b/meson.build
101
index XXXXXXX..XXXXXXX 100644
102
--- a/meson.build
103
+++ b/meson.build
104
@@ -XXX,XX +XXX,XX @@ subdir('dump')
105
106
block_ss.add(files(
107
'block.c',
108
- 'blockdev-nbd.c',
109
'blockjob.c',
110
'job.c',
111
'qemu-io-cmds.c',
112
@@ -XXX,XX +XXX,XX @@ subdir('block')
113
114
blockdev_ss.add(files(
115
'blockdev.c',
116
+ 'blockdev-nbd.c',
117
'iothread.c',
118
'job-qmp.c',
119
))
120
@@ -XXX,XX +XXX,XX @@ if have_tools
121
qemu_io = executable('qemu-io', files('qemu-io.c'),
122
dependencies: [block, qemuutil], install: true)
123
qemu_nbd = executable('qemu-nbd', files('qemu-nbd.c'),
124
- dependencies: [block, qemuutil], install: true)
125
+ dependencies: [blockdev, qemuutil], install: true)
126
127
subdir('storage-daemon')
128
subdir('contrib/rdmacm-mux')
129
diff --git a/nbd/meson.build b/nbd/meson.build
130
index XXXXXXX..XXXXXXX 100644
131
--- a/nbd/meson.build
132
+++ b/nbd/meson.build
133
@@ -XXX,XX +XXX,XX @@
134
block_ss.add(files(
135
'client.c',
136
'common.c',
137
+))
138
+blockdev_ss.add(files(
139
'server.c',
140
))
591
diff --git a/stubs/meson.build b/stubs/meson.build
141
diff --git a/stubs/meson.build b/stubs/meson.build
592
index XXXXXXX..XXXXXXX 100644
142
index XXXXXXX..XXXXXXX 100644
593
--- a/stubs/meson.build
143
--- a/stubs/meson.build
594
+++ b/stubs/meson.build
144
+++ b/stubs/meson.build
595
@@ -XXX,XX +XXX,XX @@ if have_system
145
@@ -XXX,XX +XXX,XX @@
596
else
146
stub_ss.add(files('arch_type.c'))
597
stub_ss.add(files('qdev.c'))
147
stub_ss.add(files('bdrv-next-monitor-owned.c'))
598
endif
148
stub_ss.add(files('blk-commit-all.c'))
599
+stub_ss.add(when: 'CONFIG_VFIO_USER_SERVER', if_false: files('vfio-user-obj.c'))
149
+stub_ss.add(files('blk-exp-close-all.c'))
150
stub_ss.add(files('blockdev-close-all-bdrv-states.c'))
151
stub_ss.add(files('change-state-handler.c'))
152
stub_ss.add(files('cmos.c'))
600
--
153
--
601
2.36.1
154
2.26.2
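The per-vector mask handling that msi_set_mask() adds in the patch above can be modelled in isolation as a pair of 32-bit bitmaps. The following is a simplified sketch under stated assumptions, not QEMU code: struct demo_msi, demo_msi_raise() and demo_msi_set_mask() are hypothetical, config-space accesses are replaced by plain fields, and delivery is just a counter. The sketch rejects any vector number equal to or above the 32-vector limit, because shifting a 32-bit value by 32 is undefined in C.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_MSI_VECTORS_MAX 32   /* hypothetical stand-in for the MSI limit */

/* Hypothetical model of the MSI mask and pending registers. */
struct demo_msi {
    uint32_t mask;                              /* bit set: vector masked  */
    uint32_t pending;                           /* bit set: IRQ held back  */
    unsigned delivered[DEMO_MSI_VECTORS_MAX];   /* deliveries per vector   */
};

/* Raise an interrupt: deliver it now, or latch it as pending if masked. */
static void demo_msi_raise(struct demo_msi *m, unsigned vector)
{
    uint32_t bit;

    if (vector >= DEMO_MSI_VECTORS_MAX) {
        return;
    }
    bit = UINT32_C(1) << vector;

    if (m->mask & bit) {
        m->pending |= bit;
    } else {
        m->delivered[vector]++;
    }
}

/*
 * Mask or unmask one vector. Unmasking a vector with a latched interrupt
 * clears the pending bit and delivers it. Returns 0 on success, -1 if the
 * vector number is out of range.
 */
static int demo_msi_set_mask(struct demo_msi *m, unsigned vector, bool mask)
{
    uint32_t bit;

    if (vector >= DEMO_MSI_VECTORS_MAX) {
        return -1;
    }
    bit = UINT32_C(1) << vector;

    if (mask) {
        m->mask |= bit;
    } else {
        m->mask &= ~bit;
    }

    if (!mask && (m->pending & bit)) {
        m->pending &= ~bit;
        m->delivered[vector]++;
    }
    return 0;
}
```

Masking vector 3, raising it, and then unmasking it results in exactly one delivery, which happens at unmask time rather than when the interrupt was raised.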
155
1
From: Jagannathan Raman <jag.raman@oracle.com>
1
Make it possible to specify the iothread where the export will run. By
2
default the block node can be moved to other AioContexts later and the
3
export will follow. The fixed-iothread option forces strict behavior
4
that prevents changing AioContext while the export is active. See the
5
QAPI docs for details.
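The interaction between the two options can be summarised as a small decision function. This is only a sketch of the behaviour described in the commit message and QAPI text, with hypothetical names throughout; in particular, whether the iothread exists and whether the block node can be moved are passed in as plain booleans instead of being looked up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical outcome of choosing the AioContext for a new export. */
enum demo_export_ctx {
    DEMO_CTX_KEPT,    /* export runs in the node's current thread */
    DEMO_CTX_MOVED,   /* node was moved to the requested iothread */
    DEMO_CTX_ERROR    /* export creation fails */
};

/*
 * iothread        - requested iothread name, or NULL if not given
 * fixed_iothread  - value of the fixed-iothread option (default false)
 * iothread_exists - whether an iothread with that name was found
 * move_succeeds   - whether the block node can be moved to that iothread
 */
static enum demo_export_ctx
demo_choose_export_ctx(const char *iothread, bool fixed_iothread,
                       bool iothread_exists, bool move_succeeds)
{
    if (iothread == NULL) {
        /* Default: use the thread currently associated with the node. */
        return DEMO_CTX_KEPT;
    }
    if (!iothread_exists) {
        /* Naming an unknown iothread is always an error. */
        return DEMO_CTX_ERROR;
    }
    if (move_succeeds) {
        return DEMO_CTX_MOVED;
    }
    /* The move failed: only the strict mode turns this into an error. */
    return fixed_iothread ? DEMO_CTX_ERROR : DEMO_CTX_KEPT;
}
```

The last branch is the assumption worth checking against the real code: without fixed-iothread, a failed move is taken here to leave the export in the node's current context.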
2
6
3
Determine the BARs used by the PCI device and register handlers to
7
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
4
manage the access to the same.
8
Message-id: 20200929125516.186715-5-stefanha@redhat.com
5
9
[Fix stray '#' character in block-export.json and add missing "(since:
6
Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
10
5.2)" as suggested by Eric Blake.
7
Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
11
--Stefan]
8
Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
9
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
10
Message-id: 3373e10b5be5f42846f0632d4382466e1698c505.1655151679.git.jag.raman@oracle.com
11
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
12
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
12
---
13
---
13
include/exec/memory.h | 3 +
14
qapi/block-export.json | 11 ++++++++++
14
hw/remote/vfio-user-obj.c | 190 ++++++++++++++++++++++++++++++++
15
block/export/export.c | 31 +++++++++++++++++++++++++++-
15
softmmu/physmem.c | 4 +-
16
block/export/vhost-user-blk-server.c | 5 ++++-
16
tests/qtest/fuzz/generic_fuzz.c | 9 +-
17
nbd/server.c | 2 --
17
hw/remote/trace-events | 3 +
18
4 files changed, 45 insertions(+), 4 deletions(-)
18
5 files changed, 203 insertions(+), 6 deletions(-)
19
19
20
diff --git a/include/exec/memory.h b/include/exec/memory.h
20
diff --git a/qapi/block-export.json b/qapi/block-export.json
21
index XXXXXXX..XXXXXXX 100644
21
index XXXXXXX..XXXXXXX 100644
22
--- a/include/exec/memory.h
22
--- a/qapi/block-export.json
23
+++ b/include/exec/memory.h
23
+++ b/qapi/block-export.json
24
@@ -XXX,XX +XXX,XX @@ MemTxResult address_space_write_cached_slow(MemoryRegionCache *cache,
24
@@ -XXX,XX +XXX,XX @@
25
hwaddr addr, const void *buf,
25
# export before completion is signalled. (since: 5.2;
26
hwaddr len);
26
# default: false)
27
27
#
28
+int memory_access_size(MemoryRegion *mr, unsigned l, hwaddr addr);
28
+# @iothread: The name of the iothread object where the export will run. The
29
+bool prepare_mmio_access(MemoryRegion *mr);
29
+# default is to use the thread currently associated with the
30
+# block node. (since: 5.2)
31
+#
32
+# @fixed-iothread: True prevents the block node from being moved to another
33
+# thread while the export is active. If true and @iothread is
34
+# given, export creation fails if the block node cannot be
35
+# moved to the iothread. The default is false. (since: 5.2)
36
+#
37
# Since: 4.2
38
##
39
{ 'union': 'BlockExportOptions',
40
'base': { 'type': 'BlockExportType',
41
'id': 'str',
42
+     '*fixed-iothread': 'bool',
43
+     '*iothread': 'str',
44
'node-name': 'str',
45
'*writable': 'bool',
46
'*writethrough': 'bool' },
47
diff --git a/block/export/export.c b/block/export/export.c
48
index XXXXXXX..XXXXXXX 100644
49
--- a/block/export/export.c
50
+++ b/block/export/export.c
51
@@ -XXX,XX +XXX,XX @@
52
53
#include "block/block.h"
54
#include "sysemu/block-backend.h"
55
+#include "sysemu/iothread.h"
56
#include "block/export.h"
57
#include "block/nbd.h"
58
#include "qapi/error.h"
59
@@ -XXX,XX +XXX,XX @@ static const BlockExportDriver *blk_exp_find_driver(BlockExportType type)
60
61
BlockExport *blk_exp_add(BlockExportOptions *export, Error **errp)
62
{
63
+ bool fixed_iothread = export->has_fixed_iothread && export->fixed_iothread;
64
const BlockExportDriver *drv;
65
BlockExport *exp = NULL;
66
BlockDriverState *bs;
67
- BlockBackend *blk;
68
+ BlockBackend *blk = NULL;
69
AioContext *ctx;
70
uint64_t perm;
71
int ret;
72
@@ -XXX,XX +XXX,XX @@ BlockExport *blk_exp_add(BlockExportOptions *export, Error **errp)
73
ctx = bdrv_get_aio_context(bs);
74
aio_context_acquire(ctx);
75
76
+ if (export->has_iothread) {
77
+ IOThread *iothread;
78
+ AioContext *new_ctx;
30
+
79
+
31
static inline bool memory_access_is_direct(MemoryRegion *mr, bool is_write)
80
+ iothread = iothread_by_id(export->iothread);
32
{
81
+ if (!iothread) {
33
if (is_write) {
82
+ error_setg(errp, "iothread \"%s\" not found", export->iothread);
34
diff --git a/hw/remote/vfio-user-obj.c b/hw/remote/vfio-user-obj.c
83
+ goto fail;
35
index XXXXXXX..XXXXXXX 100644
36
--- a/hw/remote/vfio-user-obj.c
37
+++ b/hw/remote/vfio-user-obj.c
38
@@ -XXX,XX +XXX,XX @@
39
#include "hw/qdev-core.h"
40
#include "hw/pci/pci.h"
41
#include "qemu/timer.h"
42
+#include "exec/memory.h"
43
44
#define TYPE_VFU_OBJECT "x-vfio-user-server"
45
OBJECT_DECLARE_TYPE(VfuObject, VfuObjectClass, VFU_OBJECT)
46
@@ -XXX,XX +XXX,XX @@ static void dma_unregister(vfu_ctx_t *vfu_ctx, vfu_dma_info_t *info)
47
trace_vfu_dma_unregister((uint64_t)info->iova.iov_base);
48
}
49
50
+static int vfu_object_mr_rw(MemoryRegion *mr, uint8_t *buf, hwaddr offset,
51
+ hwaddr size, const bool is_write)
52
+{
53
+ uint8_t *ptr = buf;
54
+ bool release_lock = false;
55
+ uint8_t *ram_ptr = NULL;
56
+ MemTxResult result;
57
+ int access_size;
58
+ uint64_t val;
59
+
60
+ if (memory_access_is_direct(mr, is_write)) {
61
+ /**
62
+ * Some devices expose a PCI expansion ROM, which could be buffer
63
+ * based as compared to other regions which are primarily based on
64
+ * MemoryRegionOps. memory_region_find() would already check
65
+ * for buffer overflow, we don't need to repeat it here.
66
+ */
67
+ ram_ptr = memory_region_get_ram_ptr(mr);
68
+
69
+ if (is_write) {
70
+ memcpy((ram_ptr + offset), buf, size);
71
+ } else {
72
+ memcpy(buf, (ram_ptr + offset), size);
73
+ }
84
+ }
74
+
85
+
75
+ return 0;
86
+ new_ctx = iothread_get_aio_context(iothread);
87
+
88
+ ret = bdrv_try_set_aio_context(bs, new_ctx, errp);
89
+ if (ret == 0) {
90
+ aio_context_release(ctx);
91
+ aio_context_acquire(new_ctx);
92
+ ctx = new_ctx;
93
+ } else if (fixed_iothread) {
94
+ goto fail;
95
+ }
76
+ }
96
+ }
77
+
97
+
78
+ while (size) {
98
/*
79
+ /**
99
* Block exports are used for non-shared storage migration. Make sure
80
+ * The read/write logic used below is similar to the ones in
100
* that BDRV_O_INACTIVE is cleared and the image is ready for write
81
+ * flatview_read/write_continue()
101
@@ -XXX,XX +XXX,XX @@ BlockExport *blk_exp_add(BlockExportOptions *export, Error **errp)
82
+ */
102
}
83
+ release_lock = prepare_mmio_access(mr);
103
104
blk = blk_new(ctx, perm, BLK_PERM_ALL);
84
+
105
+
85
+ access_size = memory_access_size(mr, size, offset);
106
+ if (!fixed_iothread) {
86
+
107
+ blk_set_allow_aio_context_change(blk, true);
87
+ if (is_write) {
88
+ val = ldn_he_p(ptr, access_size);
89
+
90
+ result = memory_region_dispatch_write(mr, offset, val,
91
+ size_memop(access_size),
92
+ MEMTXATTRS_UNSPECIFIED);
93
+ } else {
94
+ result = memory_region_dispatch_read(mr, offset, &val,
95
+ size_memop(access_size),
96
+ MEMTXATTRS_UNSPECIFIED);
97
+
98
+ stn_he_p(ptr, access_size, val);
99
+ }
100
+
101
+ if (release_lock) {
102
+ qemu_mutex_unlock_iothread();
103
+ release_lock = false;
104
+ }
105
+
106
+ if (result != MEMTX_OK) {
107
+ return -1;
108
+ }
109
+
110
+ size -= access_size;
111
+ ptr += access_size;
112
+ offset += access_size;
113
+ }
108
+ }
114
+
109
+
115
+ return 0;
110
ret = blk_insert_bs(blk, bs, errp);
116
+}
111
if (ret < 0) {
112
goto fail;
113
diff --git a/block/export/vhost-user-blk-server.c b/block/export/vhost-user-blk-server.c
114
index XXXXXXX..XXXXXXX 100644
115
--- a/block/export/vhost-user-blk-server.c
116
+++ b/block/export/vhost-user-blk-server.c
117
@@ -XXX,XX +XXX,XX @@ static const VuDevIface vu_blk_iface = {
118
static void blk_aio_attached(AioContext *ctx, void *opaque)
119
{
120
VuBlkExport *vexp = opaque;
117
+
121
+
118
+static size_t vfu_object_bar_rw(PCIDevice *pci_dev, int pci_bar,
122
+ vexp->export.ctx = ctx;
119
+ hwaddr bar_offset, char * const buf,
123
vhost_user_server_attach_aio_context(&vexp->vu_server, ctx);
120
+ hwaddr len, const bool is_write)
124
}
121
+{
125
122
+ MemoryRegionSection section = { 0 };
126
static void blk_aio_detach(void *opaque)
123
+ uint8_t *ptr = (uint8_t *)buf;
127
{
124
+ MemoryRegion *section_mr = NULL;
128
VuBlkExport *vexp = opaque;
125
+ uint64_t section_size;
126
+ hwaddr section_offset;
127
+ hwaddr size = 0;
128
+
129
+
129
+ while (len) {
130
vhost_user_server_detach_aio_context(&vexp->vu_server);
130
+ section = memory_region_find(pci_dev->io_regions[pci_bar].memory,
131
+ vexp->export.ctx = NULL;
131
+ bar_offset, len);
132
}
132
+
133
133
+ if (!section.mr) {
134
static void
134
+ warn_report("vfu: invalid address 0x%"PRIx64"", bar_offset);
135
@@ -XXX,XX +XXX,XX @@ static int vu_blk_exp_create(BlockExport *exp, BlockExportOptions *opts,
135
+ return size;
136
vu_blk_initialize_config(blk_bs(exp->blk), &vexp->blkcfg,
136
+ }
137
logical_block_size);
137
+
138
138
+ section_mr = section.mr;
139
- blk_set_allow_aio_context_change(exp->blk, true);
139
+ section_offset = section.offset_within_region;
140
blk_add_aio_context_notifier(exp->blk, blk_aio_attached, blk_aio_detach,
140
+ section_size = int128_get64(section.size);
141
vexp);
141
+
142
142
+ if (is_write && section_mr->readonly) {
143
diff --git a/nbd/server.c b/nbd/server.c
143
+ warn_report("vfu: attempting to write to readonly region in "
144
index XXXXXXX..XXXXXXX 100644
144
+ "bar %d - [0x%"PRIx64" - 0x%"PRIx64"]",
145
--- a/nbd/server.c
145
+ pci_bar, bar_offset,
146
+++ b/nbd/server.c
146
+ (bar_offset + section_size));
147
@@ -XXX,XX +XXX,XX @@ static int nbd_export_create(BlockExport *blk_exp, BlockExportOptions *exp_args,
147
+ memory_region_unref(section_mr);
148
return ret;
148
+ return size;
149
+ }
150
+
151
+ if (vfu_object_mr_rw(section_mr, ptr, section_offset,
152
+ section_size, is_write)) {
153
+ warn_report("vfu: failed to %s "
154
+ "[0x%"PRIx64" - 0x%"PRIx64"] in bar %d",
155
+ is_write ? "write to" : "read from", bar_offset,
156
+ (bar_offset + section_size), pci_bar);
157
+ memory_region_unref(section_mr);
158
+ return size;
159
+ }
160
+
161
+ size += section_size;
162
+ bar_offset += section_size;
163
+ ptr += section_size;
164
+ len -= section_size;
165
+
166
+ memory_region_unref(section_mr);
167
+ }
168
+
169
+ return size;
170
+}
171
+
172
+/**
173
+ * VFU_OBJECT_BAR_HANDLER - macro for defining handlers for PCI BARs.
174
+ *
175
+ * To create handler for BAR number 2, VFU_OBJECT_BAR_HANDLER(2) would
176
+ * define vfu_object_bar2_handler
177
+ */
178
+#define VFU_OBJECT_BAR_HANDLER(BAR_NO) \
179
+ static ssize_t vfu_object_bar##BAR_NO##_handler(vfu_ctx_t *vfu_ctx, \
180
+ char * const buf, size_t count, \
181
+ loff_t offset, const bool is_write) \
182
+ { \
183
+ VfuObject *o = vfu_get_private(vfu_ctx); \
184
+ PCIDevice *pci_dev = o->pci_dev; \
185
+ \
186
+ return vfu_object_bar_rw(pci_dev, BAR_NO, offset, \
187
+ buf, count, is_write); \
188
+ } \
189
+
190
+VFU_OBJECT_BAR_HANDLER(0)
191
+VFU_OBJECT_BAR_HANDLER(1)
192
+VFU_OBJECT_BAR_HANDLER(2)
193
+VFU_OBJECT_BAR_HANDLER(3)
194
+VFU_OBJECT_BAR_HANDLER(4)
195
+VFU_OBJECT_BAR_HANDLER(5)
196
+VFU_OBJECT_BAR_HANDLER(6)
197
+
198
+static vfu_region_access_cb_t *vfu_object_bar_handlers[PCI_NUM_REGIONS] = {
199
+ &vfu_object_bar0_handler,
200
+ &vfu_object_bar1_handler,
201
+ &vfu_object_bar2_handler,
202
+ &vfu_object_bar3_handler,
203
+ &vfu_object_bar4_handler,
204
+ &vfu_object_bar5_handler,
205
+ &vfu_object_bar6_handler,
206
+};
207
+
208
+/**
209
+ * vfu_object_register_bars - Identify active BAR regions of pdev and setup
210
+ * callbacks to handle read/write accesses
211
+ */
212
+static void vfu_object_register_bars(vfu_ctx_t *vfu_ctx, PCIDevice *pdev)
213
+{
214
+ int flags = VFU_REGION_FLAG_RW;
215
+ int i;
216
+
217
+ for (i = 0; i < PCI_NUM_REGIONS; i++) {
218
+ if (!pdev->io_regions[i].size) {
219
+ continue;
220
+ }
221
+
222
+ if ((i == VFU_PCI_DEV_ROM_REGION_IDX) ||
223
+ pdev->io_regions[i].memory->readonly) {
224
+ flags &= ~VFU_REGION_FLAG_WRITE;
225
+ }
226
+
227
+ vfu_setup_region(vfu_ctx, VFU_PCI_DEV_BAR0_REGION_IDX + i,
228
+ (size_t)pdev->io_regions[i].size,
229
+ vfu_object_bar_handlers[i],
230
+ flags, NULL, 0, -1, 0);
231
+
232
+ trace_vfu_bar_register(i, pdev->io_regions[i].addr,
233
+ pdev->io_regions[i].size);
234
+ }
235
+}
236
+
237
/*
238
* TYPE_VFU_OBJECT depends on the availability of the 'socket' and 'device'
239
* properties. It also depends on devices instantiated in QEMU. These
240
@@ -XXX,XX +XXX,XX @@ static void vfu_object_init_ctx(VfuObject *o, Error **errp)
241
goto fail;
242
}
149
}
243
150
244
+ vfu_object_register_bars(o->vfu_ctx, o->pci_dev);
151
- blk_set_allow_aio_context_change(blk, true);
245
+
152
-
246
ret = vfu_realize_ctx(o->vfu_ctx);
153
QTAILQ_INIT(&exp->clients);
247
if (ret < 0) {
154
exp->name = g_strdup(arg->name);
248
error_setg(errp, "vfu: Failed to realize device %s- %s",
155
exp->description = g_strdup(arg->description);
249
diff --git a/softmmu/physmem.c b/softmmu/physmem.c
250
index XXXXXXX..XXXXXXX 100644
251
--- a/softmmu/physmem.c
252
+++ b/softmmu/physmem.c
253
@@ -XXX,XX +XXX,XX @@ void memory_region_flush_rom_device(MemoryRegion *mr, hwaddr addr, hwaddr size)
254
invalidate_and_set_dirty(mr, addr, size);
255
}
256
257
-static int memory_access_size(MemoryRegion *mr, unsigned l, hwaddr addr)
258
+int memory_access_size(MemoryRegion *mr, unsigned l, hwaddr addr)
259
{
260
unsigned access_size_max = mr->ops->valid.max_access_size;
261
262
@@ -XXX,XX +XXX,XX @@ static int memory_access_size(MemoryRegion *mr, unsigned l, hwaddr addr)
263
return l;
264
}
265
266
-static bool prepare_mmio_access(MemoryRegion *mr)
267
+bool prepare_mmio_access(MemoryRegion *mr)
268
{
269
bool release_lock = false;
270
271
diff --git a/tests/qtest/fuzz/generic_fuzz.c b/tests/qtest/fuzz/generic_fuzz.c
272
index XXXXXXX..XXXXXXX 100644
273
--- a/tests/qtest/fuzz/generic_fuzz.c
274
+++ b/tests/qtest/fuzz/generic_fuzz.c
275
@@ -XXX,XX +XXX,XX @@ static void *pattern_alloc(pattern p, size_t len)
276
return buf;
277
}
278
279
-static int memory_access_size(MemoryRegion *mr, unsigned l, hwaddr addr)
280
+static int fuzz_memory_access_size(MemoryRegion *mr, unsigned l, hwaddr addr)
281
{
282
unsigned access_size_max = mr->ops->valid.max_access_size;
283
284
@@ -XXX,XX +XXX,XX @@ void fuzz_dma_read_cb(size_t addr, size_t len, MemoryRegion *mr)
285
286
/*
287
* If mr1 isn't RAM, address_space_translate doesn't update l. Use
288
- * memory_access_size to identify the number of bytes that it is safe
289
- * to write without accidentally writing to another MemoryRegion.
290
+ * fuzz_memory_access_size to identify the number of bytes that it
291
+ * is safe to write without accidentally writing to another
292
+ * MemoryRegion.
293
*/
294
if (!memory_region_is_ram(mr1)) {
295
- l = memory_access_size(mr1, l, addr1);
296
+ l = fuzz_memory_access_size(mr1, l, addr1);
297
}
298
if (memory_region_is_ram(mr1) ||
299
memory_region_is_romd(mr1) ||
300
diff --git a/hw/remote/trace-events b/hw/remote/trace-events
301
index XXXXXXX..XXXXXXX 100644
302
--- a/hw/remote/trace-events
303
+++ b/hw/remote/trace-events
304
@@ -XXX,XX +XXX,XX @@ vfu_cfg_read(uint32_t offset, uint32_t val) "vfu: cfg: 0x%u -> 0x%x"
305
vfu_cfg_write(uint32_t offset, uint32_t val) "vfu: cfg: 0x%u <- 0x%x"
306
vfu_dma_register(uint64_t gpa, size_t len) "vfu: registering GPA 0x%"PRIx64", %zu bytes"
307
vfu_dma_unregister(uint64_t gpa) "vfu: unregistering GPA 0x%"PRIx64""
308
+vfu_bar_register(int i, uint64_t addr, uint64_t size) "vfu: BAR %d: addr 0x%"PRIx64" size 0x%"PRIx64""
309
+vfu_bar_rw_enter(const char *op, uint64_t addr) "vfu: %s request for BAR address 0x%"PRIx64""
310
+vfu_bar_rw_exit(const char *op, uint64_t addr) "vfu: Finished %s of BAR address 0x%"PRIx64""
311
--
156
--
312
2.36.1
157
2.26.2
158
diff view generated by jsdifflib
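The MMIO path of vfu_object_mr_rw() above splits one transfer into several device-sized accesses. The following is a minimal standalone sketch of that chunking. It assumes a simplified access-size rule (cap at the region's maximum, respect the natural alignment of the offset, round down to a power of two). The names sketch_pow2floor, sketch_access_size and sketch_count_accesses are hypothetical and are not QEMU functions.

```c
#include <stdint.h>
#include <stddef.h>

/* Largest power of two that is <= v (v must be non-zero). */
static unsigned sketch_pow2floor(unsigned v)
{
    unsigned p = 1;

    while (p <= v / 2) {
        p *= 2;
    }
    return p;
}

/*
 * Hypothetical stand-in for an access-size helper: pick how many bytes
 * the next access may cover, given the bytes remaining, the current
 * offset and the largest access the region accepts (0 means "use 4").
 */
static unsigned sketch_access_size(unsigned remaining, uint64_t offset,
                                   unsigned max_access)
{
    unsigned limit = max_access ? max_access : 4;
    uint64_t align = offset & -offset;   /* lowest set bit of offset */

    if (align != 0 && align < limit) {
        limit = (unsigned)align;
    }
    if (remaining > limit) {
        remaining = limit;
    }
    return sketch_pow2floor(remaining);
}

/* Walk a transfer chunk by chunk and count the accesses it needs. */
static unsigned sketch_count_accesses(uint64_t offset, unsigned size,
                                      unsigned max_access)
{
    unsigned n = 0;

    while (size) {
        unsigned chunk = sketch_access_size(size, offset, max_access);

        size -= chunk;
        offset += chunk;
        n++;
    }
    return n;
}
```

Under these assumptions, a 7-byte transfer starting at offset 1 with a 4-byte maximum is issued as accesses of 1, 2 and 4 bytes, because each access must stay naturally aligned.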
1
Every laio_io_plug() call has a matching laio_io_unplug() call. There is
1
Allow the number of queues to be configured using --export
2
a plugged counter that tracks the number of levels of plugging and
2
vhost-user-blk,num-queues=N. This setting should match the QEMU --device
3
allows for nesting.
3
vhost-user-blk-pci,num-queues=N setting but QEMU vhost-user-blk.c lowers
4
its own value if the vhost-user-blk backend offers fewer queues than
5
QEMU.
4
6
5
The plugged counter must reflect the balance between laio_io_plug() and
7
The vhost-user-blk-server.c code is already capable of multi-queue. All
6
laio_io_unplug() calls accurately. Otherwise I/O stalls occur since
8
virtqueue processing runs in the same AioContext. No new locking is
7
io_submit(2) calls are skipped while plugged.
9
needed.
8
10
9
Reported-by: Nikolay Tenev <nt@storpool.com>
11
Add the num-queues=N option and set the VIRTIO_BLK_F_MQ feature bit.
12
Note that the feature bit only announces the presence of the num_queues
13
configuration space field. It does not promise that there is more than 1
14
virtqueue, so we can set it unconditionally.
15
16
I tested multi-queue by running a random read fio test with numjobs=4 on
17
an -smp 4 guest. After the benchmark finished the guest /proc/interrupts
18
file showed activity on all 4 virtio-blk MSI-X vectors. The /sys/block/vda/mq/
19
directory shows that Linux blk-mq has 4 queues configured.
20
21
An automated test is included in the next commit.
22
10
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
23
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
11
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
24
Acked-by: Markus Armbruster <armbru@redhat.com>
12
Message-id: 20220609164712.1539045-2-stefanha@redhat.com
25
Message-id: 20201001144604.559733-2-stefanha@redhat.com
13
Cc: Stefano Garzarella <sgarzare@redhat.com>
26
[Fixed accidental tab characters as suggested by Markus Armbruster
14
Fixes: 68d7946648 ("linux-aio: add `dev_max_batch` parameter to laio_io_unplug()")
15
[Stefano Garzarella suggested adding a Fixes tag.
16
--Stefan]
27
--Stefan]
17
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
28
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
18
---
29
---
19
block/linux-aio.c | 4 +++-
30
qapi/block-export.json | 10 +++++++---
20
1 file changed, 3 insertions(+), 1 deletion(-)
31
block/export/vhost-user-blk-server.c | 24 ++++++++++++++++++------
32
2 files changed, 25 insertions(+), 9 deletions(-)
21
33
22
diff --git a/block/linux-aio.c b/block/linux-aio.c
34
diff --git a/qapi/block-export.json b/qapi/block-export.json
23
index XXXXXXX..XXXXXXX 100644
35
index XXXXXXX..XXXXXXX 100644
24
--- a/block/linux-aio.c
36
--- a/qapi/block-export.json
25
+++ b/block/linux-aio.c
37
+++ b/qapi/block-export.json
26
@@ -XXX,XX +XXX,XX @@ void laio_io_unplug(BlockDriverState *bs, LinuxAioState *s,
38
@@ -XXX,XX +XXX,XX @@
27
uint64_t dev_max_batch)
39
# SocketAddress types are supported. Passed fds must be UNIX domain
40
# sockets.
41
# @logical-block-size: Logical block size in bytes. Defaults to 512 bytes.
42
+# @num-queues: Number of request virtqueues. Must be greater than 0. Defaults
43
+# to 1.
44
#
45
# Since: 5.2
46
##
47
{ 'struct': 'BlockExportOptionsVhostUserBlk',
48
- 'data': { 'addr': 'SocketAddress', '*logical-block-size': 'size' } }
49
+ 'data': { 'addr': 'SocketAddress',
50
+     '*logical-block-size': 'size',
51
+ '*num-queues': 'uint16'} }
52
53
##
54
# @NbdServerAddOptions:
55
@@ -XXX,XX +XXX,XX @@
56
{ 'union': 'BlockExportOptions',
57
'base': { 'type': 'BlockExportType',
58
'id': 'str',
59
-     '*fixed-iothread': 'bool',
60
-     '*iothread': 'str',
61
+ '*fixed-iothread': 'bool',
62
+ '*iothread': 'str',
63
'node-name': 'str',
64
'*writable': 'bool',
65
'*writethrough': 'bool' },
66
diff --git a/block/export/vhost-user-blk-server.c b/block/export/vhost-user-blk-server.c
67
index XXXXXXX..XXXXXXX 100644
68
--- a/block/export/vhost-user-blk-server.c
69
+++ b/block/export/vhost-user-blk-server.c
70
@@ -XXX,XX +XXX,XX @@
71
#include "util/block-helpers.h"
72
73
enum {
74
- VHOST_USER_BLK_MAX_QUEUES = 1,
75
+ VHOST_USER_BLK_NUM_QUEUES_DEFAULT = 1,
76
};
77
struct virtio_blk_inhdr {
78
unsigned char status;
79
@@ -XXX,XX +XXX,XX @@ static uint64_t vu_blk_get_features(VuDev *dev)
80
1ull << VIRTIO_BLK_F_DISCARD |
81
1ull << VIRTIO_BLK_F_WRITE_ZEROES |
82
1ull << VIRTIO_BLK_F_CONFIG_WCE |
83
+ 1ull << VIRTIO_BLK_F_MQ |
84
1ull << VIRTIO_F_VERSION_1 |
85
1ull << VIRTIO_RING_F_INDIRECT_DESC |
86
1ull << VIRTIO_RING_F_EVENT_IDX |
87
@@ -XXX,XX +XXX,XX @@ static void blk_aio_detach(void *opaque)
88
89
static void
90
vu_blk_initialize_config(BlockDriverState *bs,
91
- struct virtio_blk_config *config, uint32_t blk_size)
92
+ struct virtio_blk_config *config,
93
+ uint32_t blk_size,
94
+ uint16_t num_queues)
28
{
95
{
29
assert(s->io_q.plugged);
96
config->capacity = bdrv_getlength(bs) >> BDRV_SECTOR_BITS;
30
+ s->io_q.plugged--;
97
config->blk_size = blk_size;
98
@@ -XXX,XX +XXX,XX @@ vu_blk_initialize_config(BlockDriverState *bs,
99
config->seg_max = 128 - 2;
100
config->min_io_size = 1;
101
config->opt_io_size = 1;
102
- config->num_queues = VHOST_USER_BLK_MAX_QUEUES;
103
+ config->num_queues = num_queues;
104
config->max_discard_sectors = 32768;
105
config->max_discard_seg = 1;
106
config->discard_sector_alignment = config->blk_size >> 9;
107
@@ -XXX,XX +XXX,XX @@ static int vu_blk_exp_create(BlockExport *exp, BlockExportOptions *opts,
108
BlockExportOptionsVhostUserBlk *vu_opts = &opts->u.vhost_user_blk;
109
Error *local_err = NULL;
110
uint64_t logical_block_size;
111
+ uint16_t num_queues = VHOST_USER_BLK_NUM_QUEUES_DEFAULT;
112
113
vexp->writable = opts->writable;
114
vexp->blkcfg.wce = 0;
115
@@ -XXX,XX +XXX,XX @@ static int vu_blk_exp_create(BlockExport *exp, BlockExportOptions *opts,
116
}
117
vexp->blk_size = logical_block_size;
118
blk_set_guest_block_size(exp->blk, logical_block_size);
31
+
119
+
32
if (s->io_q.in_queue >= laio_max_batch(s, dev_max_batch) ||
120
+ if (vu_opts->has_num_queues) {
33
- (--s->io_q.plugged == 0 &&
121
+ num_queues = vu_opts->num_queues;
34
+ (!s->io_q.plugged &&
122
+ }
35
!s->io_q.blocked && !QSIMPLEQ_EMPTY(&s->io_q.pending))) {
123
+ if (num_queues == 0) {
36
ioq_submit(s);
124
+ error_setg(errp, "num-queues must be greater than 0");
37
}
125
+ return -EINVAL;
126
+ }
127
+
128
vu_blk_initialize_config(blk_bs(exp->blk), &vexp->blkcfg,
129
- logical_block_size);
130
+ logical_block_size, num_queues);
131
132
blk_add_aio_context_notifier(exp->blk, blk_aio_attached, blk_aio_detach,
133
vexp);
134
135
if (!vhost_user_server_start(&vexp->vu_server, vu_opts->addr, exp->ctx,
136
- VHOST_USER_BLK_MAX_QUEUES, &vu_blk_iface,
137
- errp)) {
138
+ num_queues, &vu_blk_iface, errp)) {
139
blk_remove_aio_context_notifier(exp->blk, blk_aio_attached,
140
blk_aio_detach, vexp);
141
return -EADDRNOTAVAIL;
38
--
142
--
39
2.36.1
143
2.26.2
144
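The linux-aio fix above moves the decrement of the plugged counter out of the `||` expression. The sketch below illustrates why that matters. It assumes a stripped-down queue state that only models the counter and a submit count. SketchQueue, sketch_unplug_buggy, sketch_unplug_fixed and sketch_run are hypothetical names and perform no real I/O.

```c
#include <stdbool.h>

typedef struct {
    unsigned plugged;    /* nesting depth of plug calls */
    unsigned in_queue;   /* requests waiting to be submitted */
    unsigned submits;    /* how often a submit was triggered */
} SketchQueue;

static void sketch_submit(SketchQueue *q)
{
    q->submits++;
    q->in_queue = 0;
}

/* Pre-fix shape: the decrement is skipped when the left operand is true. */
static void sketch_unplug_buggy(SketchQueue *q, unsigned max_batch)
{
    if (q->in_queue >= max_batch ||
        (--q->plugged == 0 && q->in_queue > 0)) {
        sketch_submit(q);
    }
}

/* Post-fix shape: always decrement first, then decide whether to submit. */
static void sketch_unplug_fixed(SketchQueue *q, unsigned max_batch)
{
    q->plugged--;
    if (q->in_queue >= max_batch ||
        (!q->plugged && q->in_queue > 0)) {
        sketch_submit(q);
    }
}

/*
 * Plug once with a full batch queued, unplug once, and return the
 * counter value that is left behind.
 */
static unsigned sketch_run(bool fixed)
{
    SketchQueue q = { .plugged = 1, .in_queue = 4, .submits = 0 };

    if (fixed) {
        sketch_unplug_fixed(&q, 4);
    } else {
        sketch_unplug_buggy(&q, 4);
    }
    return q.plugged;
}
```

In the pre-fix shape the short-circuit leaves the counter at 1 after a balanced plug/unplug pair, so later requests would be treated as still plugged. The post-fix shape returns the counter to 0.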
1
From: Jagannathan Raman <jag.raman@oracle.com>
1
From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2
2
3
Create a context with the vfio-user library to run a PCI device.
3
bdrv_co_block_status_above has several design problems with handling
4
short backing files:
4
5
5
Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
6
1. With want_zero=true, it may return ret with BDRV_BLOCK_ZERO but
6
Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
7
without the BDRV_BLOCK_ALLOCATED flag, when actually the short backing file
7
Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
8
which produces these after-EOF zeros is inside the requested backing
8
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
9
sequence.
9
Message-id: a452871ac8c812ff96fc4f0ce6037f4769953fab.1655151679.git.jag.raman@oracle.com
10
11
2. With want_zero=false, it may return pnum=0 prior to actual EOF,
12
because of EOF of short backing file.
13
14
Fix these things, making logic about short backing files clearer.
15
16
With fixed bdrv_block_status_above we also have to improve is_zero in
17
qcow2 code, otherwise iotest 154 will fail, because with this patch we
18
stop merging zeros of different types (produced by fully unallocated
19
in the whole backing chain regions vs produced by short backing files).
20
21
Note also, that this patch leaves for another day the general problem
22
around block-status: misuse of BDRV_BLOCK_ALLOCATED as is-fs-allocated
23
vs go-to-backing.
24
25
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
26
Reviewed-by: Alberto Garcia <berto@igalia.com>
27
Reviewed-by: Eric Blake <eblake@redhat.com>
28
Message-id: 20200924194003.22080-2-vsementsov@virtuozzo.com
29
[Fix s/comes/come/ as suggested by Eric Blake
30
--Stefan]
10
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
31
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
11
---
32
---
12
hw/remote/vfio-user-obj.c | 82 +++++++++++++++++++++++++++++++++++++++
33
block/io.c | 68 ++++++++++++++++++++++++++++++++++++++++-----------
13
1 file changed, 82 insertions(+)
34
block/qcow2.c | 16 ++++++++++--
35
2 files changed, 68 insertions(+), 16 deletions(-)
14
36
15
diff --git a/hw/remote/vfio-user-obj.c b/hw/remote/vfio-user-obj.c
37
diff --git a/block/io.c b/block/io.c
16
index XXXXXXX..XXXXXXX 100644
38
index XXXXXXX..XXXXXXX 100644
17
--- a/hw/remote/vfio-user-obj.c
39
--- a/block/io.c
18
+++ b/hw/remote/vfio-user-obj.c
40
+++ b/block/io.c
19
@@ -XXX,XX +XXX,XX @@
41
@@ -XXX,XX +XXX,XX @@ bdrv_co_common_block_status_above(BlockDriverState *bs,
20
#include "hw/remote/machine.h"
42
int64_t *map,
21
#include "qapi/error.h"
43
BlockDriverState **file)
22
#include "qapi/qapi-visit-sockets.h"
44
{
23
+#include "qemu/notify.h"
45
+ int ret;
24
+#include "sysemu/sysemu.h"
46
BlockDriverState *p;
25
+#include "libvfio-user.h"
47
- int ret = 0;
26
48
- bool first = true;
27
#define TYPE_VFU_OBJECT "x-vfio-user-server"
49
+ int64_t eof = 0;
28
OBJECT_DECLARE_TYPE(VfuObject, VfuObjectClass, VFU_OBJECT)
50
29
@@ -XXX,XX +XXX,XX @@ struct VfuObject {
51
assert(bs != base);
30
char *device;
52
- for (p = bs; p != base; p = bdrv_filter_or_cow_bs(p)) {
31
32
Error *err;
33
+
53
+
34
+ Notifier machine_done;
54
+ ret = bdrv_co_block_status(bs, want_zero, offset, bytes, pnum, map, file);
35
+
55
+ if (ret < 0 || *pnum == 0 || ret & BDRV_BLOCK_ALLOCATED) {
36
+ vfu_ctx_t *vfu_ctx;
56
+ return ret;
37
};
38
39
+static void vfu_object_init_ctx(VfuObject *o, Error **errp);
40
+
41
static bool vfu_object_auto_shutdown(void)
42
{
43
bool auto_shutdown = true;
44
@@ -XXX,XX +XXX,XX @@ static void vfu_object_set_socket(Object *obj, Visitor *v, const char *name,
45
{
46
VfuObject *o = VFU_OBJECT(obj);
47
48
+ if (o->vfu_ctx) {
49
+ error_setg(errp, "vfu: Unable to set socket property - server busy");
50
+ return;
51
+ }
57
+ }
52
+
58
+
53
qapi_free_SocketAddress(o->socket);
59
+ if (ret & BDRV_BLOCK_EOF) {
54
60
+ eof = offset + *pnum;
55
o->socket = NULL;
56
@@ -XXX,XX +XXX,XX @@ static void vfu_object_set_socket(Object *obj, Visitor *v, const char *name,
57
}
58
59
trace_vfu_prop("socket", o->socket->u.q_unix.path);
60
+
61
+ vfu_object_init_ctx(o, errp);
62
}
63
64
static void vfu_object_set_device(Object *obj, const char *str, Error **errp)
65
{
66
VfuObject *o = VFU_OBJECT(obj);
67
68
+ if (o->vfu_ctx) {
69
+ error_setg(errp, "vfu: Unable to set device property - server busy");
70
+ return;
71
+ }
61
+ }
72
+
62
+
73
g_free(o->device);
63
+ assert(*pnum <= bytes);
74
64
+ bytes = *pnum;
75
o->device = g_strdup(str);
76
77
trace_vfu_prop("device", str);
78
+
65
+
79
+ vfu_object_init_ctx(o, errp);
66
+ for (p = bdrv_filter_or_cow_bs(bs); p != base;
80
+}
67
+ p = bdrv_filter_or_cow_bs(p))
68
+ {
69
ret = bdrv_co_block_status(p, want_zero, offset, bytes, pnum, map,
70
file);
71
if (ret < 0) {
72
- break;
73
+ return ret;
74
}
75
- if (ret & BDRV_BLOCK_ZERO && ret & BDRV_BLOCK_EOF && !first) {
76
+ if (*pnum == 0) {
77
/*
78
- * Reading beyond the end of the file continues to read
79
- * zeroes, but we can only widen the result to the
80
- * unallocated length we learned from an earlier
81
- * iteration.
82
+ * The top layer deferred to this layer, and because this layer is
83
+ * short, any zeroes that we synthesize beyond EOF behave as if they
84
+ * were allocated at this layer.
85
+ *
86
+ * We don't include BDRV_BLOCK_EOF into ret, as upper layer may be
87
+ * larger. We'll add BDRV_BLOCK_EOF if needed at function end, see
88
+ * below.
89
*/
90
+ assert(ret & BDRV_BLOCK_EOF);
91
*pnum = bytes;
92
+ if (file) {
93
+ *file = p;
94
+ }
95
+ ret = BDRV_BLOCK_ZERO | BDRV_BLOCK_ALLOCATED;
96
+ break;
97
}
98
- if (ret & (BDRV_BLOCK_ZERO | BDRV_BLOCK_DATA)) {
99
+ if (ret & BDRV_BLOCK_ALLOCATED) {
100
+ /*
101
+ * We've found the node and the status, we must break.
102
+ *
103
+ * Drop BDRV_BLOCK_EOF, as it's not for upper layer, which may be
104
+ * larger. We'll add BDRV_BLOCK_EOF if needed at function end, see
105
+ * below.
106
+ */
107
+ ret &= ~BDRV_BLOCK_EOF;
108
break;
109
}
110
- /* [offset, pnum] unallocated on this layer, which could be only
111
- * the first part of [offset, bytes]. */
112
- bytes = MIN(bytes, *pnum);
113
- first = false;
81
+
114
+
82
+/*
115
+ /*
83
+ * TYPE_VFU_OBJECT depends on the availability of the 'socket' and 'device'
116
+ * OK, [offset, offset + *pnum) region is unallocated on this layer,
84
+ * properties. It also depends on devices instantiated in QEMU. These
117
+ * let's continue the diving.
85
+ * dependencies are not available during the instance_init phase of this
118
+ */
86
+ * object's life-cycle. As such, the server is initialized after the
119
+ assert(*pnum <= bytes);
87
+ * machine is setup. machine_init_done_notifier notifies TYPE_VFU_OBJECT
120
+ bytes = *pnum;
88
+ * when the machine is setup, and the dependencies are available.
89
+ */
90
+static void vfu_object_machine_done(Notifier *notifier, void *data)
91
+{
92
+ VfuObject *o = container_of(notifier, VfuObject, machine_done);
93
+ Error *err = NULL;
94
+
95
+ vfu_object_init_ctx(o, &err);
96
+
97
+ if (err) {
98
+ error_propagate(&error_abort, err);
99
+ }
100
+}
101
+
102
+static void vfu_object_init_ctx(VfuObject *o, Error **errp)
103
+{
104
+ ERRP_GUARD();
105
+
106
+ if (o->vfu_ctx || !o->socket || !o->device ||
107
+ !phase_check(PHASE_MACHINE_READY)) {
108
+ return;
109
+ }
121
+ }
110
+
122
+
111
+ if (o->err) {
123
+ if (offset + *pnum == eof) {
112
+ error_propagate(errp, o->err);
124
+ ret |= BDRV_BLOCK_EOF;
113
+ o->err = NULL;
114
+ return;
115
+ }
116
+
117
+ o->vfu_ctx = vfu_create_ctx(VFU_TRANS_SOCK, o->socket->u.q_unix.path, 0,
118
+ o, VFU_DEV_TYPE_PCI);
119
+ if (o->vfu_ctx == NULL) {
120
+ error_setg(errp, "vfu: Failed to create context - %s", strerror(errno));
121
+ return;
122
+ }
123
}
124
125
static void vfu_object_init(Object *obj)
126
@@ -XXX,XX +XXX,XX @@ static void vfu_object_init(Object *obj)
127
TYPE_VFU_OBJECT, TYPE_REMOTE_MACHINE);
128
return;
129
}
125
}
130
+
126
+
131
+ if (!phase_check(PHASE_MACHINE_READY)) {
127
return ret;
132
+ o->machine_done.notify = vfu_object_machine_done;
128
}
133
+ qemu_add_machine_init_done_notifier(&o->machine_done);
129
134
+ }
130
diff --git a/block/qcow2.c b/block/qcow2.c
131
index XXXXXXX..XXXXXXX 100644
132
--- a/block/qcow2.c
133
+++ b/block/qcow2.c
134
@@ -XXX,XX +XXX,XX @@ static bool is_zero(BlockDriverState *bs, int64_t offset, int64_t bytes)
135
if (!bytes) {
136
return true;
137
}
138
- res = bdrv_block_status_above(bs, NULL, offset, bytes, &nr, NULL, NULL);
139
- return res >= 0 && (res & BDRV_BLOCK_ZERO) && nr == bytes;
135
+
140
+
141
+ /*
142
+ * bdrv_block_status_above doesn't merge different types of zeros, for
143
+ * example, zeros which come from the region which is unallocated in
144
+ * the whole backing chain, and zeros which come because of a short
145
+ * backing file. So, we need a loop.
146
+ */
147
+ do {
148
+ res = bdrv_block_status_above(bs, NULL, offset, bytes, &nr, NULL, NULL);
149
+ offset += nr;
150
+ bytes -= nr;
151
+ } while (res >= 0 && (res & BDRV_BLOCK_ZERO) && nr && bytes);
152
+
153
+ return res >= 0 && (res & BDRV_BLOCK_ZERO) && bytes == 0;
136
}
154
}
137
155
138
static void vfu_object_finalize(Object *obj)
156
static coroutine_fn int qcow2_co_pwrite_zeroes(BlockDriverState *bs,
139
@@ -XXX,XX +XXX,XX @@ static void vfu_object_finalize(Object *obj)
140
141
o->socket = NULL;
142
143
+ if (o->vfu_ctx) {
144
+ vfu_destroy_ctx(o->vfu_ctx);
145
+ o->vfu_ctx = NULL;
146
+ }
147
+
148
g_free(o->device);
149
150
o->device = NULL;
151
@@ -XXX,XX +XXX,XX @@ static void vfu_object_finalize(Object *obj)
152
if (!k->nr_devs && vfu_object_auto_shutdown()) {
153
qemu_system_shutdown_request(SHUTDOWN_CAUSE_GUEST_SHUTDOWN);
154
}
155
+
156
+ if (o->machine_done.notify) {
157
+ qemu_remove_machine_init_done_notifier(&o->machine_done);
158
+ o->machine_done.notify = NULL;
159
+ }
160
}
161
162
static void vfu_object_class_init(ObjectClass *klass, void *data)
163
--
157
--
164
2.36.1
158
2.26.2
159
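The qcow2 is_zero() change above loops because a single block-status query may now describe only the first part of the requested range. The sketch below shows that loop against a mock extent map. SketchExtent, sketch_status and sketch_is_zero are hypothetical and only imitate the shape of the real calls. The map is assumed to start at offset 0 with contiguous runs.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stddef.h>

/* One run of bytes that shares a single status. */
typedef struct {
    int64_t length;
    bool zero;
} SketchExtent;

/*
 * Hypothetical status query: report how many bytes starting at offset
 * share one status (at most `bytes`) and whether they read as zeros.
 * Returns false when offset lies outside the extent map.
 */
static bool sketch_status(const SketchExtent *map, size_t n, int64_t offset,
                          int64_t bytes, int64_t *nr, bool *zero)
{
    int64_t start = 0;

    for (size_t i = 0; i < n; i++) {
        int64_t end = start + map[i].length;

        if (offset < end) {
            int64_t avail = end - offset;

            *nr = avail < bytes ? avail : bytes;
            *zero = map[i].zero;
            return true;
        }
        start = end;
    }
    return false;
}

/* The range is zero only if every consecutive run inside it is zero. */
static bool sketch_is_zero(const SketchExtent *map, size_t n,
                           int64_t offset, int64_t bytes)
{
    int64_t nr = 0;
    bool zero = true;
    bool ok = true;

    if (!bytes) {
        return true;
    }

    do {
        ok = sketch_status(map, n, offset, bytes, &nr, &zero);
        if (!ok) {
            break;
        }
        offset += nr;
        bytes -= nr;
    } while (zero && nr && bytes);

    return ok && zero && bytes == 0;
}
```

With two adjacent zero runs of 4 bytes each, a single query over 8 bytes reports only the first run, yet the loop still concludes that the whole range is zero. A non-zero second run makes it return false.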
1
From: Jagannathan Raman <jag.raman@oracle.com>
1
From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2
2
3
Find the PCI device with the specified id. Initialize the device context
3
In order to reuse bdrv_common_block_status_above in
4
with the QEMU PCI device.
4
bdrv_is_allocated_above, let's support include_base parameter.
5
5
6
Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
6
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
7
Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
7
Reviewed-by: Alberto Garcia <berto@igalia.com>
8
Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
8
Reviewed-by: Eric Blake <eblake@redhat.com>
9
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
9
Message-id: 20200924194003.22080-3-vsementsov@virtuozzo.com
10
Message-id: 7798dbd730099b33fdd00c4c202cfe79e5c5c151.1655151679.git.jag.raman@oracle.com
11
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
12
---
11
---
13
hw/remote/vfio-user-obj.c | 67 +++++++++++++++++++++++++++++++++++++++
12
block/coroutines.h | 2 ++
14
1 file changed, 67 insertions(+)
13
block/io.c | 21 ++++++++++++++-------
14
2 files changed, 16 insertions(+), 7 deletions(-)
15
15
16
diff --git a/hw/remote/vfio-user-obj.c b/hw/remote/vfio-user-obj.c
16
diff --git a/block/coroutines.h b/block/coroutines.h
17
index XXXXXXX..XXXXXXX 100644
17
index XXXXXXX..XXXXXXX 100644
18
--- a/hw/remote/vfio-user-obj.c
18
--- a/block/coroutines.h
19
+++ b/hw/remote/vfio-user-obj.c
19
+++ b/block/coroutines.h
20
@@ -XXX,XX +XXX,XX @@
20
@@ -XXX,XX +XXX,XX @@ bdrv_pwritev(BdrvChild *child, int64_t offset, unsigned int bytes,
21
#include "qemu/notify.h"
21
int coroutine_fn
22
#include "sysemu/sysemu.h"
22
bdrv_co_common_block_status_above(BlockDriverState *bs,
23
#include "libvfio-user.h"
23
BlockDriverState *base,
24
+#include "hw/qdev-core.h"
24
+ bool include_base,
25
+#include "hw/pci/pci.h"
25
bool want_zero,
26
26
int64_t offset,
27
#define TYPE_VFU_OBJECT "x-vfio-user-server"
27
int64_t bytes,
28
OBJECT_DECLARE_TYPE(VfuObject, VfuObjectClass, VFU_OBJECT)
28
@@ -XXX,XX +XXX,XX @@ bdrv_co_common_block_status_above(BlockDriverState *bs,
29
@@ -XXX,XX +XXX,XX @@ struct VfuObject {
29
int generated_co_wrapper
30
Notifier machine_done;
30
bdrv_common_block_status_above(BlockDriverState *bs,
31
31
BlockDriverState *base,
32
vfu_ctx_t *vfu_ctx;
32
+ bool include_base,
33
bool want_zero,
34
int64_t offset,
35
int64_t bytes,
36
diff --git a/block/io.c b/block/io.c
37
index XXXXXXX..XXXXXXX 100644
38
--- a/block/io.c
39
+++ b/block/io.c
40
@@ -XXX,XX +XXX,XX @@ early_out:
41
int coroutine_fn
42
bdrv_co_common_block_status_above(BlockDriverState *bs,
43
BlockDriverState *base,
44
+ bool include_base,
45
bool want_zero,
46
int64_t offset,
47
int64_t bytes,
48
@@ -XXX,XX +XXX,XX @@ bdrv_co_common_block_status_above(BlockDriverState *bs,
49
BlockDriverState *p;
50
int64_t eof = 0;
51
52
- assert(bs != base);
53
+ assert(include_base || bs != base);
54
+ assert(!include_base || base); /* Can't include NULL base */
55
56
ret = bdrv_co_block_status(bs, want_zero, offset, bytes, pnum, map, file);
57
- if (ret < 0 || *pnum == 0 || ret & BDRV_BLOCK_ALLOCATED) {
58
+ if (ret < 0 || *pnum == 0 || ret & BDRV_BLOCK_ALLOCATED || bs == base) {
59
return ret;
60
}
61
62
@@ -XXX,XX +XXX,XX @@ bdrv_co_common_block_status_above(BlockDriverState *bs,
63
assert(*pnum <= bytes);
64
bytes = *pnum;
65
66
- for (p = bdrv_filter_or_cow_bs(bs); p != base;
67
+ for (p = bdrv_filter_or_cow_bs(bs); include_base || p != base;
68
p = bdrv_filter_or_cow_bs(p))
69
{
70
ret = bdrv_co_block_status(p, want_zero, offset, bytes, pnum, map,
71
@@ -XXX,XX +XXX,XX @@ bdrv_co_common_block_status_above(BlockDriverState *bs,
72
break;
73
}
74
75
+ if (p == base) {
76
+ assert(include_base);
77
+ break;
78
+ }
33
+
79
+
34
+ PCIDevice *pci_dev;
80
/*
35
+
81
* OK, [offset, offset + *pnum) region is unallocated on this layer,
36
+ Error *unplug_blocker;
82
* let's continue the diving.
37
};
83
@@ -XXX,XX +XXX,XX @@ int bdrv_block_status_above(BlockDriverState *bs, BlockDriverState *base,
38
84
int64_t offset, int64_t bytes, int64_t *pnum,
39
static void vfu_object_init_ctx(VfuObject *o, Error **errp);
85
int64_t *map, BlockDriverState **file)
40
@@ -XXX,XX +XXX,XX @@ static void vfu_object_machine_done(Notifier *notifier, void *data)
41
static void vfu_object_init_ctx(VfuObject *o, Error **errp)
42
{
86
{
43
ERRP_GUARD();
87
- return bdrv_common_block_status_above(bs, base, true, offset, bytes,
44
+ DeviceState *dev = NULL;
88
+ return bdrv_common_block_status_above(bs, base, false, true, offset, bytes,
45
+ vfu_pci_type_t pci_type = VFU_PCI_TYPE_CONVENTIONAL;
89
pnum, map, file);
46
+ int ret;
47
48
if (o->vfu_ctx || !o->socket || !o->device ||
49
!phase_check(PHASE_MACHINE_READY)) {
50
@@ -XXX,XX +XXX,XX @@ static void vfu_object_init_ctx(VfuObject *o, Error **errp)
51
error_setg(errp, "vfu: Failed to create context - %s", strerror(errno));
52
return;
53
}
54
+
55
+ dev = qdev_find_recursive(sysbus_get_default(), o->device);
56
+ if (dev == NULL) {
57
+ error_setg(errp, "vfu: Device %s not found", o->device);
58
+ goto fail;
59
+ }
60
+
61
+ if (!object_dynamic_cast(OBJECT(dev), TYPE_PCI_DEVICE)) {
62
+ error_setg(errp, "vfu: %s not a PCI device", o->device);
63
+ goto fail;
64
+ }
65
+
66
+ o->pci_dev = PCI_DEVICE(dev);
67
+
68
+ object_ref(OBJECT(o->pci_dev));
69
+
70
+ if (pci_is_express(o->pci_dev)) {
71
+ pci_type = VFU_PCI_TYPE_EXPRESS;
72
+ }
73
+
74
+ ret = vfu_pci_init(o->vfu_ctx, pci_type, PCI_HEADER_TYPE_NORMAL, 0);
75
+ if (ret < 0) {
76
+ error_setg(errp,
77
+ "vfu: Failed to attach PCI device %s to context - %s",
78
+ o->device, strerror(errno));
79
+ goto fail;
80
+ }
81
+
82
+ error_setg(&o->unplug_blocker,
83
+ "vfu: %s for %s must be deleted before unplugging",
84
+ TYPE_VFU_OBJECT, o->device);
85
+ qdev_add_unplug_blocker(DEVICE(o->pci_dev), o->unplug_blocker);
86
+
87
+ return;
88
+
89
+fail:
90
+ vfu_destroy_ctx(o->vfu_ctx);
91
+ if (o->unplug_blocker && o->pci_dev) {
92
+ qdev_del_unplug_blocker(DEVICE(o->pci_dev), o->unplug_blocker);
93
+ error_free(o->unplug_blocker);
94
+ o->unplug_blocker = NULL;
95
+ }
96
+ if (o->pci_dev) {
97
+ object_unref(OBJECT(o->pci_dev));
98
+ o->pci_dev = NULL;
99
+ }
100
+ o->vfu_ctx = NULL;
101
}
90
}
102
91
103
static void vfu_object_init(Object *obj)
92
@@ -XXX,XX +XXX,XX @@ int coroutine_fn bdrv_is_allocated(BlockDriverState *bs, int64_t offset,
104
@@ -XXX,XX +XXX,XX @@ static void vfu_object_finalize(Object *obj)
93
int ret;
105
94
int64_t dummy;
106
o->device = NULL;
95
107
96
- ret = bdrv_common_block_status_above(bs, bdrv_filter_or_cow_bs(bs), false,
108
+ if (o->unplug_blocker && o->pci_dev) {
97
- offset, bytes, pnum ? pnum : &dummy,
109
+ qdev_del_unplug_blocker(DEVICE(o->pci_dev), o->unplug_blocker);
98
- NULL, NULL);
110
+ error_free(o->unplug_blocker);
99
+ ret = bdrv_common_block_status_above(bs, bs, true, false, offset,
111
+ o->unplug_blocker = NULL;
100
+ bytes, pnum ? pnum : &dummy, NULL,
112
+ }
101
+ NULL);
113
+
102
if (ret < 0) {
114
+ if (o->pci_dev) {
103
return ret;
115
+ object_unref(OBJECT(o->pci_dev));
116
+ o->pci_dev = NULL;
117
+ }
118
+
119
if (!k->nr_devs && vfu_object_auto_shutdown()) {
120
qemu_system_shutdown_request(SHUTDOWN_CAUSE_GUEST_SHUTDOWN);
121
}
104
}
122
--
105
--
123
2.36.1
106
2.26.2
107
1
From: Jagannathan Raman <jag.raman@oracle.com>
1
From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2
2
3
Adds handler to reset a remote device
3
We are going to reuse bdrv_common_block_status_above in
4
bdrv_is_allocated_above. bdrv_is_allocated_above may be called with
5
include_base == false and still bs == base (for ex. from img_rebase()).
4
6
5
Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
7
So, support this corner case.
6
Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
8
7
Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
9
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
8
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
10
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
9
Message-id: 112eeadf3bc4c6cdb100bc3f9a6fcfc20b467c1b.1655151679.git.jag.raman@oracle.com
11
Reviewed-by: Eric Blake <eblake@redhat.com>
12
Reviewed-by: Alberto Garcia <berto@igalia.com>
13
Message-id: 20200924194003.22080-4-vsementsov@virtuozzo.com
10
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
14
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
11
---
15
---
12
hw/remote/vfio-user-obj.c | 20 ++++++++++++++++++++
16
block/io.c | 6 +++++-
13
1 file changed, 20 insertions(+)
17
1 file changed, 5 insertions(+), 1 deletion(-)
14
18
15
diff --git a/hw/remote/vfio-user-obj.c b/hw/remote/vfio-user-obj.c
19
diff --git a/block/io.c b/block/io.c
16
index XXXXXXX..XXXXXXX 100644
20
index XXXXXXX..XXXXXXX 100644
17
--- a/hw/remote/vfio-user-obj.c
21
--- a/block/io.c
18
+++ b/hw/remote/vfio-user-obj.c
22
+++ b/block/io.c
19
@@ -XXX,XX +XXX,XX @@ void vfu_object_set_bus_irq(PCIBus *pci_bus)
23
@@ -XXX,XX +XXX,XX @@ bdrv_co_common_block_status_above(BlockDriverState *bs,
20
max_bdf);
24
BlockDriverState *p;
21
}
25
int64_t eof = 0;
22
26
23
+static int vfu_object_device_reset(vfu_ctx_t *vfu_ctx, vfu_reset_type_t type)
27
- assert(include_base || bs != base);
24
+{
28
assert(!include_base || base); /* Can't include NULL base */
25
+ VfuObject *o = vfu_get_private(vfu_ctx);
29
26
+
30
+ if (!include_base && bs == base) {
27
+ /* vfu_object_ctx_run() handles lost connection */
31
+ *pnum = bytes;
28
+ if (type == VFU_RESET_LOST_CONN) {
29
+ return 0;
32
+ return 0;
30
+ }
33
+ }
31
+
34
+
32
+ qdev_reset_all(DEVICE(o->pci_dev));
35
ret = bdrv_co_block_status(bs, want_zero, offset, bytes, pnum, map, file);
33
+
36
if (ret < 0 || *pnum == 0 || ret & BDRV_BLOCK_ALLOCATED || bs == base) {
34
+ return 0;
37
return ret;
35
+}
36
+
37
/*
38
* TYPE_VFU_OBJECT depends on the availability of the 'socket' and 'device'
39
* properties. It also depends on devices instantiated in QEMU. These
40
@@ -XXX,XX +XXX,XX @@ static void vfu_object_init_ctx(VfuObject *o, Error **errp)
41
goto fail;
42
}
43
44
+ ret = vfu_setup_device_reset_cb(o->vfu_ctx, &vfu_object_device_reset);
45
+ if (ret < 0) {
46
+ error_setg(errp, "vfu: Failed to setup reset callback");
47
+ goto fail;
48
+ }
49
+
50
ret = vfu_realize_ctx(o->vfu_ctx);
51
if (ret < 0) {
52
error_setg(errp, "vfu: Failed to realize device %s- %s",
53
--
38
--
54
2.36.1
39
2.26.2
40
1
From: Jagannathan Raman <jag.raman@oracle.com>
1
From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2
2
3
Define and register handlers for PCI config space accesses
3
bdrv_is_allocated_above wrongly handles short backing files: it reports
4
after-EOF space as UNALLOCATED which is wrong, as on read the data is
5
generated on the level of short backing file (if all overlays have
6
unallocated areas at that place).
4
7
5
Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
8
Reusing bdrv_common_block_status_above fixes the issue and unifies code
6
Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
9
path.
7
Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
10
8
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
11
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
9
Message-id: be9d2ccf9b1d24e50dcd9c23404dbf284142cec7.1655151679.git.jag.raman@oracle.com
12
Reviewed-by: Eric Blake <eblake@redhat.com>
13
Reviewed-by: Alberto Garcia <berto@igalia.com>
14
Message-id: 20200924194003.22080-5-vsementsov@virtuozzo.com
15
[Fix s/has/have/ as suggested by Eric Blake. Fix s/area/areas/.
16
--Stefan]
10
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
17
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
11
---
18
---
12
hw/remote/vfio-user-obj.c | 51 +++++++++++++++++++++++++++++++++++++++
19
block/io.c | 43 +++++--------------------------------------
13
hw/remote/trace-events | 2 ++
20
1 file changed, 5 insertions(+), 38 deletions(-)
14
2 files changed, 53 insertions(+)
15
21
16
diff --git a/hw/remote/vfio-user-obj.c b/hw/remote/vfio-user-obj.c
22
diff --git a/block/io.c b/block/io.c
17
index XXXXXXX..XXXXXXX 100644
23
index XXXXXXX..XXXXXXX 100644
18
--- a/hw/remote/vfio-user-obj.c
24
--- a/block/io.c
19
+++ b/hw/remote/vfio-user-obj.c
25
+++ b/block/io.c
20
@@ -XXX,XX +XXX,XX @@
26
@@ -XXX,XX +XXX,XX @@ int coroutine_fn bdrv_is_allocated(BlockDriverState *bs, int64_t offset,
21
#include "qapi/qapi-events-misc.h"
27
* at 'offset + *pnum' may return the same allocation status (in other
22
#include "qemu/notify.h"
28
* words, the result is not necessarily the maximum possible range);
23
#include "qemu/thread.h"
29
* but 'pnum' will only be 0 when end of file is reached.
24
+#include "qemu/main-loop.h"
30
- *
25
#include "sysemu/sysemu.h"
31
*/
26
#include "libvfio-user.h"
32
int bdrv_is_allocated_above(BlockDriverState *top,
27
#include "hw/qdev-core.h"
33
BlockDriverState *base,
28
@@ -XXX,XX +XXX,XX @@ retry_attach:
34
bool include_base, int64_t offset,
29
qemu_set_fd_handler(o->vfu_poll_fd, vfu_object_ctx_run, NULL, o);
35
int64_t bytes, int64_t *pnum)
36
{
37
- BlockDriverState *intermediate;
38
- int ret;
39
- int64_t n = bytes;
40
-
41
- assert(base || !include_base);
42
-
43
- intermediate = top;
44
- while (include_base || intermediate != base) {
45
- int64_t pnum_inter;
46
- int64_t size_inter;
47
-
48
- assert(intermediate);
49
- ret = bdrv_is_allocated(intermediate, offset, bytes, &pnum_inter);
50
- if (ret < 0) {
51
- return ret;
52
- }
53
- if (ret) {
54
- *pnum = pnum_inter;
55
- return 1;
56
- }
57
-
58
- size_inter = bdrv_getlength(intermediate);
59
- if (size_inter < 0) {
60
- return size_inter;
61
- }
62
- if (n > pnum_inter &&
63
- (intermediate == top || offset + pnum_inter < size_inter)) {
64
- n = pnum_inter;
65
- }
66
-
67
- if (intermediate == base) {
68
- break;
69
- }
70
-
71
- intermediate = bdrv_filter_or_cow_bs(intermediate);
72
+ int ret = bdrv_common_block_status_above(top, base, include_base, false,
73
+ offset, bytes, pnum, NULL, NULL);
74
+ if (ret < 0) {
75
+ return ret;
76
}
77
78
- *pnum = n;
79
- return 0;
80
+ return !!(ret & BDRV_BLOCK_ALLOCATED);
30
}
81
}
31
82
32
+static ssize_t vfu_object_cfg_access(vfu_ctx_t *vfu_ctx, char * const buf,
83
int coroutine_fn
33
+ size_t count, loff_t offset,
34
+ const bool is_write)
35
+{
36
+ VfuObject *o = vfu_get_private(vfu_ctx);
37
+ uint32_t pci_access_width = sizeof(uint32_t);
38
+ size_t bytes = count;
39
+ uint32_t val = 0;
40
+ char *ptr = buf;
41
+ int len;
42
+
43
+ /*
44
+ * Writes to the BAR registers would trigger an update to the
45
+ * global Memory and IO AddressSpaces. But the remote device
46
+ * never uses the global AddressSpaces, therefore overlapping
47
+ * memory regions are not a problem
48
+ */
49
+ while (bytes > 0) {
50
+ len = (bytes > pci_access_width) ? pci_access_width : bytes;
51
+ if (is_write) {
52
+ memcpy(&val, ptr, len);
53
+ pci_host_config_write_common(o->pci_dev, offset,
54
+ pci_config_size(o->pci_dev),
55
+ val, len);
56
+ trace_vfu_cfg_write(offset, val);
57
+ } else {
58
+ val = pci_host_config_read_common(o->pci_dev, offset,
59
+ pci_config_size(o->pci_dev), len);
60
+ memcpy(ptr, &val, len);
61
+ trace_vfu_cfg_read(offset, val);
62
+ }
63
+ offset += len;
64
+ ptr += len;
65
+ bytes -= len;
66
+ }
67
+
68
+ return count;
69
+}
70
+
71
/*
72
* TYPE_VFU_OBJECT depends on the availability of the 'socket' and 'device'
73
* properties. It also depends on devices instantiated in QEMU. These
74
@@ -XXX,XX +XXX,XX @@ static void vfu_object_init_ctx(VfuObject *o, Error **errp)
75
TYPE_VFU_OBJECT, o->device);
76
qdev_add_unplug_blocker(DEVICE(o->pci_dev), o->unplug_blocker);
77
78
+ ret = vfu_setup_region(o->vfu_ctx, VFU_PCI_DEV_CFG_REGION_IDX,
79
+ pci_config_size(o->pci_dev), &vfu_object_cfg_access,
80
+ VFU_REGION_FLAG_RW | VFU_REGION_FLAG_ALWAYS_CB,
81
+ NULL, 0, -1, 0);
82
+ if (ret < 0) {
83
+ error_setg(errp,
84
+ "vfu: Failed to setup config space handlers for %s- %s",
85
+ o->device, strerror(errno));
86
+ goto fail;
87
+ }
88
+
89
ret = vfu_realize_ctx(o->vfu_ctx);
90
if (ret < 0) {
91
error_setg(errp, "vfu: Failed to realize device %s- %s",
92
diff --git a/hw/remote/trace-events b/hw/remote/trace-events
93
index XXXXXXX..XXXXXXX 100644
94
--- a/hw/remote/trace-events
95
+++ b/hw/remote/trace-events
96
@@ -XXX,XX +XXX,XX @@ mpqemu_recv_io_error(int cmd, int size, int nfds) "failed to receive %d size %d,
97
98
# vfio-user-obj.c
99
vfu_prop(const char *prop, const char *val) "vfu: setting %s as %s"
100
+vfu_cfg_read(uint32_t offset, uint32_t val) "vfu: cfg: 0x%u -> 0x%x"
101
+vfu_cfg_write(uint32_t offset, uint32_t val) "vfu: cfg: 0x%u <- 0x%x"
102
--
84
--
103
2.36.1
85
2.26.2
86
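The short-backing-file case described in the bdrv_is_allocated_above patch above can be illustrated with a toy model. This is only a sketch under stated assumptions: Layer, toy_is_allocated_above() and MiB are hypothetical names invented for illustration, not QEMU APIs, and allocation is simplified to a single prefix per layer. It shows the intended semantics as described in the commit message: an offset past the end of a short backing file counts as allocated at that layer, because reads there return zeroes generated at that layer. It also covers the include_base == false, top == base corner case from the preceding patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MiB (1024 * 1024)

/* Hypothetical toy layer, NOT a QEMU BlockDriverState. */
typedef struct Layer {
    int64_t length;          /* virtual size of this layer in bytes */
    int64_t alloc_end;       /* bytes [0, alloc_end) are allocated here */
    struct Layer *backing;   /* next layer down the chain, or NULL */
} Layer;

/*
 * Toy query: is the byte at 'offset' provided by any layer between 'top'
 * and 'base'? 'base' itself is only considered when include_base is true.
 */
static bool toy_is_allocated_above(const Layer *top, const Layer *base,
                                   bool include_base, int64_t offset)
{
    assert(!include_base || base); /* can't include a NULL base */

    for (const Layer *p = top; p != NULL; p = p->backing) {
        if (p == base && !include_base) {
            /* Reached the excluded base (also covers top == base). */
            break;
        }
        if (p != top && offset >= p->length) {
            /*
             * Short backing file: data past its EOF reads as zeroes that
             * are generated at this layer, so treat it as allocated here.
             */
            return true;
        }
        if (offset < p->alloc_end) {
            return true;
        }
        if (p == base) {
            break; /* base was included and has been checked */
        }
    }
    return false;
}
```

With a chain shaped like the one in iotest 274 (a fully written 2 MiB base, an empty 1 MiB mid, an empty 2 MiB top), the query for an offset in the second MiB reports "allocated" even with the base excluded, which is why a commit from top into base has to write zeroes there rather than leave the old base data in place.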
1
It may not be obvious why laio_io_unplug() checks max batch. I discussed
1
From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2
this with Stefano and have added a comment summarizing the reason.
3
2
4
Cc: Stefano Garzarella <sgarzare@redhat.com>
3
These cases are fixed by previous patches around block_status and
5
Cc: Kevin Wolf <kwolf@redhat.com>
4
is_allocated.
6
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
5
7
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
6
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
8
Message-id: 20220609164712.1539045-3-stefanha@redhat.com
7
Reviewed-by: Eric Blake <eblake@redhat.com>
8
Reviewed-by: Alberto Garcia <berto@igalia.com>
9
Message-id: 20200924194003.22080-6-vsementsov@virtuozzo.com
9
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10
---
11
---
11
block/linux-aio.c | 6 ++++++
12
tests/qemu-iotests/274 | 20 +++++++++++
12
1 file changed, 6 insertions(+)
13
tests/qemu-iotests/274.out | 68 ++++++++++++++++++++++++++++++++++++++
14
2 files changed, 88 insertions(+)
13
15
14
diff --git a/block/linux-aio.c b/block/linux-aio.c
16
diff --git a/tests/qemu-iotests/274 b/tests/qemu-iotests/274
17
index XXXXXXX..XXXXXXX 100755
18
--- a/tests/qemu-iotests/274
19
+++ b/tests/qemu-iotests/274
20
@@ -XXX,XX +XXX,XX @@ with iotests.FilePath('base') as base, \
21
iotests.qemu_io_log('-c', 'read -P 1 0 %d' % size_short, mid)
22
iotests.qemu_io_log('-c', 'read -P 0 %d %d' % (size_short, size_diff), mid)
23
24
+ iotests.log('=== Testing qemu-img commit (top -> base) ===')
25
+
26
+ create_chain()
27
+ iotests.qemu_img_log('commit', '-b', base, top)
28
+ iotests.img_info_log(base)
29
+ iotests.qemu_io_log('-c', 'read -P 1 0 %d' % size_short, base)
30
+ iotests.qemu_io_log('-c', 'read -P 0 %d %d' % (size_short, size_diff), base)
31
+
32
+ iotests.log('=== Testing QMP active commit (top -> base) ===')
33
+
34
+ create_chain()
35
+ with create_vm() as vm:
36
+ vm.launch()
37
+ vm.qmp_log('block-commit', device='top', base_node='base',
38
+ job_id='job0', auto_dismiss=False)
39
+ vm.run_job('job0', wait=5)
40
+
41
+ iotests.img_info_log(mid)
42
+ iotests.qemu_io_log('-c', 'read -P 1 0 %d' % size_short, base)
43
+ iotests.qemu_io_log('-c', 'read -P 0 %d %d' % (size_short, size_diff), base)
44
45
iotests.log('== Resize tests ==')
46
47
diff --git a/tests/qemu-iotests/274.out b/tests/qemu-iotests/274.out
15
index XXXXXXX..XXXXXXX 100644
48
index XXXXXXX..XXXXXXX 100644
16
--- a/block/linux-aio.c
49
--- a/tests/qemu-iotests/274.out
17
+++ b/block/linux-aio.c
50
+++ b/tests/qemu-iotests/274.out
18
@@ -XXX,XX +XXX,XX @@ void laio_io_unplug(BlockDriverState *bs, LinuxAioState *s,
51
@@ -XXX,XX +XXX,XX @@ read 1048576/1048576 bytes at offset 0
19
assert(s->io_q.plugged);
52
read 1048576/1048576 bytes at offset 1048576
20
s->io_q.plugged--;
53
1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
21
54
22
+ /*
55
+=== Testing qemu-img commit (top -> base) ===
23
+ * Why max batch checking is performed here:
56
+Formatting 'TEST_DIR/PID-base', fmt=qcow2 cluster_size=65536 extended_l2=off compression_type=zlib size=2097152 lazy_refcounts=off refcount_bits=16
24
+ * Another BDS may have queued requests with a higher dev_max_batch and
57
+
25
+ * therefore in_queue could now exceed our dev_max_batch. Re-check the max
58
+Formatting 'TEST_DIR/PID-mid', fmt=qcow2 cluster_size=65536 extended_l2=off compression_type=zlib size=1048576 backing_file=TEST_DIR/PID-base backing_fmt=qcow2 lazy_refcounts=off refcount_bits=16
26
+ * batch so we can honor our device's dev_max_batch.
59
+
27
+ */
60
+Formatting 'TEST_DIR/PID-top', fmt=qcow2 cluster_size=65536 extended_l2=off compression_type=zlib size=2097152 backing_file=TEST_DIR/PID-mid backing_fmt=qcow2 lazy_refcounts=off refcount_bits=16
28
if (s->io_q.in_queue >= laio_max_batch(s, dev_max_batch) ||
61
+
29
(!s->io_q.plugged &&
62
+wrote 2097152/2097152 bytes at offset 0
30
!s->io_q.blocked && !QSIMPLEQ_EMPTY(&s->io_q.pending))) {
63
+2 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
64
+
65
+Image committed.
66
+
67
+image: TEST_IMG
68
+file format: IMGFMT
69
+virtual size: 2 MiB (2097152 bytes)
70
+cluster_size: 65536
71
+Format specific information:
72
+ compat: 1.1
73
+ compression type: zlib
74
+ lazy refcounts: false
75
+ refcount bits: 16
76
+ corrupt: false
77
+ extended l2: false
78
+
79
+read 1048576/1048576 bytes at offset 0
80
+1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
81
+
82
+read 1048576/1048576 bytes at offset 1048576
83
+1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
84
+
85
+=== Testing QMP active commit (top -> base) ===
86
+Formatting 'TEST_DIR/PID-base', fmt=qcow2 cluster_size=65536 extended_l2=off compression_type=zlib size=2097152 lazy_refcounts=off refcount_bits=16
87
+
88
+Formatting 'TEST_DIR/PID-mid', fmt=qcow2 cluster_size=65536 extended_l2=off compression_type=zlib size=1048576 backing_file=TEST_DIR/PID-base backing_fmt=qcow2 lazy_refcounts=off refcount_bits=16
89
+
90
+Formatting 'TEST_DIR/PID-top', fmt=qcow2 cluster_size=65536 extended_l2=off compression_type=zlib size=2097152 backing_file=TEST_DIR/PID-mid backing_fmt=qcow2 lazy_refcounts=off refcount_bits=16
91
+
92
+wrote 2097152/2097152 bytes at offset 0
93
+2 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
94
+
95
+{"execute": "block-commit", "arguments": {"auto-dismiss": false, "base-node": "base", "device": "top", "job-id": "job0"}}
96
+{"return": {}}
97
+{"execute": "job-complete", "arguments": {"id": "job0"}}
98
+{"return": {}}
99
+{"data": {"device": "job0", "len": 1048576, "offset": 1048576, "speed": 0, "type": "commit"}, "event": "BLOCK_JOB_READY", "timestamp": {"microseconds": "USECS", "seconds": "SECS"}}
100
+{"data": {"device": "job0", "len": 1048576, "offset": 1048576, "speed": 0, "type": "commit"}, "event": "BLOCK_JOB_COMPLETED", "timestamp": {"microseconds": "USECS", "seconds": "SECS"}}
101
+{"execute": "job-dismiss", "arguments": {"id": "job0"}}
102
+{"return": {}}
103
+image: TEST_IMG
104
+file format: IMGFMT
105
+virtual size: 1 MiB (1048576 bytes)
106
+cluster_size: 65536
107
+backing file: TEST_DIR/PID-base
108
+backing file format: IMGFMT
109
+Format specific information:
110
+ compat: 1.1
111
+ compression type: zlib
112
+ lazy refcounts: false
113
+ refcount bits: 16
114
+ corrupt: false
115
+ extended l2: false
116
+
117
+read 1048576/1048576 bytes at offset 0
118
+1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
119
+
120
+read 1048576/1048576 bytes at offset 1048576
121
+1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
122
+
123
== Resize tests ==
124
=== preallocation=off ===
125
Formatting 'TEST_DIR/PID-base', fmt=qcow2 cluster_size=65536 extended_l2=off compression_type=zlib size=6442450944 lazy_refcounts=off refcount_bits=16
31
--
126
--
32
2.36.1
127
2.26.2
128