The following changes since commit d9ccf33f9479201e5add8db0af68ca9ca8da358b:

  Merge remote-tracking branch 'remotes/lvivier-gitlab/tags/linux-user-for-7.0-pull-request' into staging (2022-03-09 20:01:17 +0000)

are available in the git repository at:

  https://github.com/jasowang/qemu.git tags/net-pull-request

for you to fetch changes up to eea40402ecf895ed345f8e8eb07dbb484f4542c5:

  vdpa: Expose VHOST_F_LOG_ALL on SVQ (2022-03-10 10:26:32 +0800)

----------------------------------------------------------------

----------------------------------------------------------------
Eugenio Pérez (14):
      vhost: Add VhostShadowVirtqueue
      vhost: Add Shadow VirtQueue kick forwarding capabilities
      vhost: Add Shadow VirtQueue call forwarding capabilities
      vhost: Add vhost_svq_valid_features to shadow vq
      virtio: Add vhost_svq_get_vring_addr
      vdpa: adapt vhost_ops callbacks to svq
      vhost: Shadow virtqueue buffers forwarding
      util: Add iova_tree_alloc_map
      util: add iova_tree_find_iova
      vhost: Add VhostIOVATree
      vdpa: Add custom IOTLB translations to SVQ
      vdpa: Adapt vhost_vdpa_get_vring_base to SVQ
      vdpa: Never set log_base addr if SVQ is enabled
      vdpa: Expose VHOST_F_LOG_ALL on SVQ

Jason Wang (1):
      virtio-net: fix map leaking on error during receive

 hw/net/virtio-net.c                |   1 +
 hw/virtio/meson.build              |   2 +-
 hw/virtio/vhost-iova-tree.c        | 110 +++++++
 hw/virtio/vhost-iova-tree.h        |  27 ++
 hw/virtio/vhost-shadow-virtqueue.c | 638 +++++++++++++++++++++++++++++++++++++
 hw/virtio/vhost-shadow-virtqueue.h |  87 +++++
 hw/virtio/vhost-vdpa.c             | 525 +++++++++++++++++++++++++++++-
 include/hw/virtio/vhost-vdpa.h     |   8 +
 include/qemu/iova-tree.h           |  38 ++-
 util/iova-tree.c                   | 169 ++++++++++
 10 files changed, 1588 insertions(+), 17 deletions(-)
 create mode 100644 hw/virtio/vhost-iova-tree.c
 create mode 100644 hw/virtio/vhost-iova-tree.h
 create mode 100644 hw/virtio/vhost-shadow-virtqueue.c
 create mode 100644 hw/virtio/vhost-shadow-virtqueue.h
Deleted patch (dropped from this version of the series):

Currently, the default number of msix vectors for virtio-net-pci is 3,
which is obviously not suitable for a multiqueue guest, so we depend on
the user or management tools to pass a correct vectors parameter. In
fact, we can simplify this by calculating the number of vectors on
realize.

Consider we have N queue pairs: the number of vectors needed is 2*N + 2
(one per TX/RX queue, plus one config interrupt and one control vq). We
didn't check whether or not the host supports the control vq because it
was added unconditionally by qemu to avoid breaking legacy guests such
as Minix.
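As a rough illustration of the calculation (a hypothetical worked case,
not part of the patch):

    /* With N = 4 queue pairs the device ends up with 10 vectors. */
    unsigned queue_pairs = 4;            /* example value */
    unsigned nvectors = 2 * queue_pairs  /* one RX and one TX vector per pair */
                        + 1              /* config interrupt */
                        + 1;             /* control vq */
    /* nvectors == 10 */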
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 hw/core/machine.c          |  1 +
 hw/virtio/virtio-net-pci.c | 10 +++++++++-
 2 files changed, 10 insertions(+), 1 deletion(-)
diff --git a/hw/core/machine.c b/hw/core/machine.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -XXX,XX +XXX,XX @@
 GlobalProperty hw_compat_5_2[] = {
     { "ICH9-LPC", "smm-compat", "on"},
     { "PIIX4_PM", "smm-compat", "on"},
+    { "virtio-net-pci", "vectors", "3"},
 };
 const size_t hw_compat_5_2_len = G_N_ELEMENTS(hw_compat_5_2);

diff --git a/hw/virtio/virtio-net-pci.c b/hw/virtio/virtio-net-pci.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/virtio/virtio-net-pci.c
+++ b/hw/virtio/virtio-net-pci.c
@@ -XXX,XX +XXX,XX @@ struct VirtIONetPCI {
 static Property virtio_net_properties[] = {
     DEFINE_PROP_BIT("ioeventfd", VirtIOPCIProxy, flags,
                     VIRTIO_PCI_FLAG_USE_IOEVENTFD_BIT, true),
-    DEFINE_PROP_UINT32("vectors", VirtIOPCIProxy, nvectors, 3),
+    DEFINE_PROP_UINT32("vectors", VirtIOPCIProxy, nvectors,
+                       DEV_NVECTORS_UNSPECIFIED),
     DEFINE_PROP_END_OF_LIST(),
 };

@@ -XXX,XX +XXX,XX @@ static void virtio_net_pci_realize(VirtIOPCIProxy *vpci_dev, Error **errp)
     DeviceState *qdev = DEVICE(vpci_dev);
     VirtIONetPCI *dev = VIRTIO_NET_PCI(vpci_dev);
     DeviceState *vdev = DEVICE(&dev->vdev);
+    VirtIONet *net = VIRTIO_NET(vdev);
+
+    if (vpci_dev->nvectors == DEV_NVECTORS_UNSPECIFIED) {
+        vpci_dev->nvectors = 2 * MAX(net->nic_conf.peers.queues, 1)
+                             + 1 /* Config interrupt */
+                             + 1 /* Control vq */;
+    }

     virtio_net_set_netclient_name(&dev->vdev, qdev->id,
                                   object_get_typename(OBJECT(qdev)));
--
2.7.4
Commit bedd7e93d0196 ("virtio-net: fix use after unmap/free for sg")
tries to fix the use after free of the sg by caching the virtqueue
elements in an array and unmapping them at once after receiving the
packets, but it forgot to unmap the cached elements on error, which
will lead to leaked mappings and other unexpected results.

Fix this by detaching the cached elements on error. This addresses
CVE-2022-26353.
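The bug class, shown as a self-contained sketch (illustrative only, not
QEMU code): resources acquired in a loop must also be released on the
error path, or everything acquired before the failure leaks.

    #include <stdlib.h>

    static int acquire_all(void **elems, int n)
    {
        int i, j;

        for (i = 0; i < n; i++) {
            elems[i] = malloc(64);      /* stands in for map + pop */
            if (!elems[i]) {
                goto err;               /* failure mid-loop */
            }
        }
        return 0;

    err:
        for (j = 0; j < i; j++) {       /* unwind what was acquired so far */
            free(elems[j]);             /* the diff below adds the analogous
                                         * virtqueue_detach_element() call */
        }
        return -1;
    }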
Reported-by: Victor Tom <vv474172261@gmail.com>
Cc: qemu-stable@nongnu.org
Fixes: CVE-2022-26353
Fixes: bedd7e93d0196 ("virtio-net: fix use after unmap/free for sg")
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 hw/net/virtio-net.c | 1 +
 1 file changed, 1 insertion(+)
diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -XXX,XX +XXX,XX @@ static ssize_t virtio_net_receive_rcu(NetClientState *nc, const uint8_t *buf,

 err:
     for (j = 0; j < i; j++) {
+        virtqueue_detach_element(q->rx_vq, elems[j], lens[j]);
         g_free(elems[j]);
     }

--
2.7.4
From: Eugenio Pérez <eperezma@redhat.com>

Vhost shadow virtqueue (SVQ) is an intermediate jump for virtqueue
notifications and buffers, allowing qemu to track them. While qemu is
forwarding the buffers and virtqueue changes, it is able to commit the
memory that is being dirtied, the same way regular qemu VirtIO devices
do.

This commit only exposes basic SVQ allocation and free. The next
patches of the series add functionality such as notification and
buffer forwarding.
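A minimal usage sketch of the API added below (the caller here is
hypothetical; vhost_svq_new(), vhost_svq_free() and the g_autoptr
support are what this patch introduces):

    #include "hw/virtio/vhost-shadow-virtqueue.h"

    static bool setup_one_svq(GPtrArray *owner)
    {
        g_autoptr(VhostShadowVirtqueue) svq = vhost_svq_new();

        if (!svq) {
            return false;  /* reason already printed via error_report() */
        }
        /* Transfer ownership; owner was created with vhost_svq_free as
         * its element destructor, so no autofree happens here. */
        g_ptr_array_add(owner, g_steal_pointer(&svq));
        return true;
    }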
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 hw/virtio/meson.build              |  2 +-
 hw/virtio/vhost-shadow-virtqueue.c | 62 ++++++++++++++++++++++++++++++++++++++
 hw/virtio/vhost-shadow-virtqueue.h | 28 +++++++++++++++++
 3 files changed, 91 insertions(+), 1 deletion(-)
 create mode 100644 hw/virtio/vhost-shadow-virtqueue.c
 create mode 100644 hw/virtio/vhost-shadow-virtqueue.h
diff --git a/hw/virtio/meson.build b/hw/virtio/meson.build
index XXXXXXX..XXXXXXX 100644
--- a/hw/virtio/meson.build
+++ b/hw/virtio/meson.build
@@ -XXX,XX +XXX,XX @@ softmmu_ss.add(when: 'CONFIG_ALL', if_true: files('vhost-stub.c'))

 virtio_ss = ss.source_set()
 virtio_ss.add(files('virtio.c'))
-virtio_ss.add(when: 'CONFIG_VHOST', if_true: files('vhost.c', 'vhost-backend.c'))
+virtio_ss.add(when: 'CONFIG_VHOST', if_true: files('vhost.c', 'vhost-backend.c', 'vhost-shadow-virtqueue.c'))
 virtio_ss.add(when: 'CONFIG_VHOST_USER', if_true: files('vhost-user.c'))
 virtio_ss.add(when: 'CONFIG_VHOST_VDPA', if_true: files('vhost-vdpa.c'))
 virtio_ss.add(when: 'CONFIG_VIRTIO_BALLOON', if_true: files('virtio-balloon.c'))
diff --git a/hw/virtio/vhost-shadow-virtqueue.c b/hw/virtio/vhost-shadow-virtqueue.c
new file mode 100644
index XXXXXXX..XXXXXXX
--- /dev/null
+++ b/hw/virtio/vhost-shadow-virtqueue.c
@@ -XXX,XX +XXX,XX @@
+/*
+ * vhost shadow virtqueue
+ *
+ * SPDX-FileCopyrightText: Red Hat, Inc. 2021
+ * SPDX-FileContributor: Author: Eugenio Pérez <eperezma@redhat.com>
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#include "qemu/osdep.h"
+#include "hw/virtio/vhost-shadow-virtqueue.h"
+
+#include "qemu/error-report.h"
+
+/**
+ * Creates vhost shadow virtqueue, and instructs the vhost device to use the
+ * shadow methods and file descriptors.
+ *
+ * Returns the new virtqueue or NULL.
+ *
+ * In case of error, reason is reported through error_report.
+ */
+VhostShadowVirtqueue *vhost_svq_new(void)
+{
+    g_autofree VhostShadowVirtqueue *svq = g_new0(VhostShadowVirtqueue, 1);
+    int r;
+
+    r = event_notifier_init(&svq->hdev_kick, 0);
+    if (r != 0) {
+        error_report("Couldn't create kick event notifier: %s (%d)",
+                     g_strerror(errno), errno);
+        goto err_init_hdev_kick;
+    }
+
+    r = event_notifier_init(&svq->hdev_call, 0);
+    if (r != 0) {
+        error_report("Couldn't create call event notifier: %s (%d)",
+                     g_strerror(errno), errno);
+        goto err_init_hdev_call;
+    }
+
+    return g_steal_pointer(&svq);
+
+err_init_hdev_call:
+    event_notifier_cleanup(&svq->hdev_kick);
+
+err_init_hdev_kick:
+    return NULL;
+}
+
+/**
+ * Free the resources of the shadow virtqueue.
+ *
+ * @pvq: gpointer to SVQ so it can be used by autofree functions.
+ */
+void vhost_svq_free(gpointer pvq)
+{
+    VhostShadowVirtqueue *vq = pvq;
+    event_notifier_cleanup(&vq->hdev_kick);
+    event_notifier_cleanup(&vq->hdev_call);
+    g_free(vq);
+}
diff --git a/hw/virtio/vhost-shadow-virtqueue.h b/hw/virtio/vhost-shadow-virtqueue.h
new file mode 100644
index XXXXXXX..XXXXXXX
--- /dev/null
+++ b/hw/virtio/vhost-shadow-virtqueue.h
@@ -XXX,XX +XXX,XX @@
+/*
+ * vhost shadow virtqueue
+ *
+ * SPDX-FileCopyrightText: Red Hat, Inc. 2021
+ * SPDX-FileContributor: Author: Eugenio Pérez <eperezma@redhat.com>
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#ifndef VHOST_SHADOW_VIRTQUEUE_H
+#define VHOST_SHADOW_VIRTQUEUE_H
+
+#include "qemu/event_notifier.h"
+
+/* Shadow virtqueue to relay notifications */
+typedef struct VhostShadowVirtqueue {
+    /* Shadow kick notifier, sent to vhost */
+    EventNotifier hdev_kick;
+    /* Shadow call notifier, sent to vhost */
+    EventNotifier hdev_call;
+} VhostShadowVirtqueue;
+
+VhostShadowVirtqueue *vhost_svq_new(void);
+
+void vhost_svq_free(gpointer vq);
+G_DEFINE_AUTOPTR_CLEANUP_FUNC(VhostShadowVirtqueue, vhost_svq_free);
+
+#endif
--
2.7.4
From: Eugenio Pérez <eperezma@redhat.com>

In this mode no buffer forwarding is performed by the SVQ: qemu just
forwards the guest's kicks to the device.

Host memory notifier regions are left out for simplicity, and they
will not be addressed in this series.
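The forwarding itself is just eventfd plumbing. A standalone sketch of
the idea in plain POSIX (illustrative only; the patch uses QEMU's
EventNotifier wrappers and the main loop instead):

    #include <stdint.h>
    #include <unistd.h>

    static void relay_kick(int guest_kick_fd, int device_kick_fd)
    {
        uint64_t counter;
        ssize_t r;

        r = read(guest_kick_fd, &counter, sizeof(counter)); /* clear kick */
        if (r == sizeof(counter)) {
            counter = 1;
            r = write(device_kick_fd, &counter,
                      sizeof(counter));                     /* forward it */
        }
        (void)r;
    }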
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 hw/virtio/vhost-shadow-virtqueue.c |  56 ++++++++++++++
 hw/virtio/vhost-shadow-virtqueue.h |  14 ++++
 hw/virtio/vhost-vdpa.c             | 145 ++++++++++++++++++++++++++++++++++++-
 include/hw/virtio/vhost-vdpa.h     |   4 +
 4 files changed, 217 insertions(+), 2 deletions(-)
diff --git a/hw/virtio/vhost-shadow-virtqueue.c b/hw/virtio/vhost-shadow-virtqueue.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/virtio/vhost-shadow-virtqueue.c
+++ b/hw/virtio/vhost-shadow-virtqueue.c
@@ -XXX,XX +XXX,XX @@
 #include "hw/virtio/vhost-shadow-virtqueue.h"

 #include "qemu/error-report.h"
+#include "qemu/main-loop.h"
+#include "linux-headers/linux/vhost.h"
+
+/**
+ * Forward guest notifications.
+ *
+ * @n: guest kick event notifier, the one that guest set to notify svq.
+ */
+static void vhost_handle_guest_kick(EventNotifier *n)
+{
+    VhostShadowVirtqueue *svq = container_of(n, VhostShadowVirtqueue,
+                                             svq_kick);
+    event_notifier_test_and_clear(n);
+    event_notifier_set(&svq->hdev_kick);
+}
+
+/**
+ * Set a new file descriptor for the guest to kick the SVQ and notify for avail
+ *
+ * @svq: The svq
+ * @svq_kick_fd: The svq kick fd
+ *
+ * Note that the SVQ will never close the old file descriptor.
+ */
+void vhost_svq_set_svq_kick_fd(VhostShadowVirtqueue *svq, int svq_kick_fd)
+{
+    EventNotifier *svq_kick = &svq->svq_kick;
+    bool poll_stop = VHOST_FILE_UNBIND != event_notifier_get_fd(svq_kick);
+    bool poll_start = svq_kick_fd != VHOST_FILE_UNBIND;
+
+    if (poll_stop) {
+        event_notifier_set_handler(svq_kick, NULL);
+    }
+
+    /*
+     * event_notifier_set_handler already checks for guest's notifications if
+     * they arrive at the new file descriptor in the switch, so there is no
+     * need to explicitly check for them.
+     */
+    if (poll_start) {
+        event_notifier_init_fd(svq_kick, svq_kick_fd);
+        event_notifier_set(svq_kick);
+        event_notifier_set_handler(svq_kick, vhost_handle_guest_kick);
+    }
+}
+
+/**
+ * Stop the shadow virtqueue operation.
+ * @svq: Shadow Virtqueue
+ */
+void vhost_svq_stop(VhostShadowVirtqueue *svq)
+{
+    event_notifier_set_handler(&svq->svq_kick, NULL);
+}

 /**
  * Creates vhost shadow virtqueue, and instructs the vhost device to use the
@@ -XXX,XX +XXX,XX @@ VhostShadowVirtqueue *vhost_svq_new(void)
         goto err_init_hdev_call;
     }

+    event_notifier_init_fd(&svq->svq_kick, VHOST_FILE_UNBIND);
     return g_steal_pointer(&svq);

 err_init_hdev_call:
@@ -XXX,XX +XXX,XX @@ err_init_hdev_kick:
 void vhost_svq_free(gpointer pvq)
 {
     VhostShadowVirtqueue *vq = pvq;
+    vhost_svq_stop(vq);
     event_notifier_cleanup(&vq->hdev_kick);
     event_notifier_cleanup(&vq->hdev_call);
     g_free(vq);
diff --git a/hw/virtio/vhost-shadow-virtqueue.h b/hw/virtio/vhost-shadow-virtqueue.h
index XXXXXXX..XXXXXXX 100644
--- a/hw/virtio/vhost-shadow-virtqueue.h
+++ b/hw/virtio/vhost-shadow-virtqueue.h
@@ -XXX,XX +XXX,XX @@ typedef struct VhostShadowVirtqueue {
     EventNotifier hdev_kick;
     /* Shadow call notifier, sent to vhost */
     EventNotifier hdev_call;
+
+    /*
+     * Borrowed virtqueue's guest to host notifier. To borrow it in this event
+     * notifier allows to recover the VhostShadowVirtqueue from the event loop
+     * easily. If we use the VirtQueue's one, we don't have an easy way to
+     * retrieve VhostShadowVirtqueue.
+     *
+     * So shadow virtqueue must not clean it, or we would lose VirtQueue one.
+     */
+    EventNotifier svq_kick;
 } VhostShadowVirtqueue;

+void vhost_svq_set_svq_kick_fd(VhostShadowVirtqueue *svq, int svq_kick_fd);
+
+void vhost_svq_stop(VhostShadowVirtqueue *svq);
+
 VhostShadowVirtqueue *vhost_svq_new(void);

 void vhost_svq_free(gpointer vq);
diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -XXX,XX +XXX,XX @@
 #include "hw/virtio/vhost.h"
 #include "hw/virtio/vhost-backend.h"
 #include "hw/virtio/virtio-net.h"
+#include "hw/virtio/vhost-shadow-virtqueue.h"
 #include "hw/virtio/vhost-vdpa.h"
 #include "exec/address-spaces.h"
 #include "qemu/main-loop.h"
 #include "cpu.h"
 #include "trace.h"
 #include "qemu-common.h"
+#include "qapi/error.h"

 /*
  * Return one past the end of the end of section. Be careful with uint64_t
@@ -XXX,XX +XXX,XX @@ static bool vhost_vdpa_one_time_request(struct vhost_dev *dev)
     return v->index != 0;
 }

+static int vhost_vdpa_init_svq(struct vhost_dev *hdev, struct vhost_vdpa *v,
+                               Error **errp)
+{
+    g_autoptr(GPtrArray) shadow_vqs = NULL;
+
+    if (!v->shadow_vqs_enabled) {
+        return 0;
+    }
+
+    shadow_vqs = g_ptr_array_new_full(hdev->nvqs, vhost_svq_free);
+    for (unsigned n = 0; n < hdev->nvqs; ++n) {
+        g_autoptr(VhostShadowVirtqueue) svq = vhost_svq_new();
+
+        if (unlikely(!svq)) {
+            error_setg(errp, "Cannot create svq %u", n);
+            return -1;
+        }
+        g_ptr_array_add(shadow_vqs, g_steal_pointer(&svq));
+    }
+
+    v->shadow_vqs = g_steal_pointer(&shadow_vqs);
+    return 0;
+}
+
 static int vhost_vdpa_init(struct vhost_dev *dev, void *opaque, Error **errp)
 {
     struct vhost_vdpa *v;
@@ -XXX,XX +XXX,XX @@ static int vhost_vdpa_init(struct vhost_dev *dev, void *opaque, Error **errp)
     dev->opaque =  opaque ;
     v->listener = vhost_vdpa_memory_listener;
     v->msg_type = VHOST_IOTLB_MSG_V2;
+    ret = vhost_vdpa_init_svq(dev, v, errp);
+    if (ret) {
+        goto err;
+    }

     vhost_vdpa_get_iova_range(v);

@@ -XXX,XX +XXX,XX @@ static int vhost_vdpa_init(struct vhost_dev *dev, void *opaque, Error **errp)
                                VIRTIO_CONFIG_S_DRIVER);

     return 0;
+
+err:
+    ram_block_discard_disable(false);
+    return ret;
 }

 static void vhost_vdpa_host_notifier_uninit(struct vhost_dev *dev,
@@ -XXX,XX +XXX,XX @@ static void vhost_vdpa_host_notifiers_uninit(struct vhost_dev *dev, int n)

 static void vhost_vdpa_host_notifiers_init(struct vhost_dev *dev)
 {
+    struct vhost_vdpa *v = dev->opaque;
     int i;

+    if (v->shadow_vqs_enabled) {
+        /* FIXME SVQ is not compatible with host notifiers mr */
+        return;
+    }
+
     for (i = dev->vq_index; i < dev->vq_index + dev->nvqs; i++) {
         if (vhost_vdpa_host_notifier_init(dev, i)) {
             goto err;
@@ -XXX,XX +XXX,XX @@ err:
     return;
 }

+static void vhost_vdpa_svq_cleanup(struct vhost_dev *dev)
+{
+    struct vhost_vdpa *v = dev->opaque;
+    size_t idx;
+
+    if (!v->shadow_vqs) {
+        return;
+    }
+
+    for (idx = 0; idx < v->shadow_vqs->len; ++idx) {
+        vhost_svq_stop(g_ptr_array_index(v->shadow_vqs, idx));
+    }
+    g_ptr_array_free(v->shadow_vqs, true);
+}
+
 static int vhost_vdpa_cleanup(struct vhost_dev *dev)
 {
     struct vhost_vdpa *v;
@@ -XXX,XX +XXX,XX @@ static int vhost_vdpa_cleanup(struct vhost_dev *dev)
     trace_vhost_vdpa_cleanup(dev, v);
     vhost_vdpa_host_notifiers_uninit(dev, dev->nvqs);
     memory_listener_unregister(&v->listener);
+    vhost_vdpa_svq_cleanup(dev);

     dev->opaque = NULL;
     ram_block_discard_disable(false);
@@ -XXX,XX +XXX,XX @@ static int vhost_vdpa_get_device_id(struct vhost_dev *dev,
     return ret;
 }

+static void vhost_vdpa_reset_svq(struct vhost_vdpa *v)
+{
+    if (!v->shadow_vqs_enabled) {
+        return;
+    }
+
+    for (unsigned i = 0; i < v->shadow_vqs->len; ++i) {
+        VhostShadowVirtqueue *svq = g_ptr_array_index(v->shadow_vqs, i);
+        vhost_svq_stop(svq);
+    }
+}
+
 static int vhost_vdpa_reset_device(struct vhost_dev *dev)
 {
+    struct vhost_vdpa *v = dev->opaque;
     int ret;
     uint8_t status = 0;

+    vhost_vdpa_reset_svq(v);
+
     ret = vhost_vdpa_call(dev, VHOST_VDPA_SET_STATUS, &status);
     trace_vhost_vdpa_reset_device(dev, status);
     return ret;
@@ -XXX,XX +XXX,XX @@ static int vhost_vdpa_get_config(struct vhost_dev *dev, uint8_t *config,
     return ret;
 }

+static int vhost_vdpa_set_vring_dev_kick(struct vhost_dev *dev,
+                                         struct vhost_vring_file *file)
+{
+    trace_vhost_vdpa_set_vring_kick(dev, file->index, file->fd);
+    return vhost_vdpa_call(dev, VHOST_SET_VRING_KICK, file);
+}
+
+/**
+ * Set the shadow virtqueue descriptors to the device
+ *
+ * @dev: The vhost device model
+ * @svq: The shadow virtqueue
+ * @idx: The index of the virtqueue in the vhost device
+ * @errp: Error
+ */
+static bool vhost_vdpa_svq_setup(struct vhost_dev *dev,
+                                 VhostShadowVirtqueue *svq,
+                                 unsigned idx,
+                                 Error **errp)
+{
+    struct vhost_vring_file file = {
+        .index = dev->vq_index + idx,
+    };
+    const EventNotifier *event_notifier = &svq->hdev_kick;
+    int r;
+
+    file.fd = event_notifier_get_fd(event_notifier);
+    r = vhost_vdpa_set_vring_dev_kick(dev, &file);
+    if (unlikely(r != 0)) {
+        error_setg_errno(errp, -r, "Can't set device kick fd");
+    }
+
+    return r == 0;
+}
+
+static bool vhost_vdpa_svqs_start(struct vhost_dev *dev)
+{
+    struct vhost_vdpa *v = dev->opaque;
+    Error *err = NULL;
+    unsigned i;
+
+    if (!v->shadow_vqs) {
+        return true;
+    }
+
+    for (i = 0; i < v->shadow_vqs->len; ++i) {
+        VhostShadowVirtqueue *svq = g_ptr_array_index(v->shadow_vqs, i);
+        bool ok = vhost_vdpa_svq_setup(dev, svq, i, &err);
+        if (unlikely(!ok)) {
+            error_reportf_err(err, "Cannot setup SVQ %u: ", i);
+            return false;
+        }
+    }
+
+    return true;
+}
+
 static int vhost_vdpa_dev_start(struct vhost_dev *dev, bool started)
 {
     struct vhost_vdpa *v = dev->opaque;
+    bool ok;
     trace_vhost_vdpa_dev_start(dev, started);

     if (started) {
         vhost_vdpa_host_notifiers_init(dev);
+        ok = vhost_vdpa_svqs_start(dev);
+        if (unlikely(!ok)) {
+            return -1;
+        }
         vhost_vdpa_set_vring_ready(dev);
     } else {
         vhost_vdpa_host_notifiers_uninit(dev, dev->nvqs);
     }
@@ -XXX,XX +XXX,XX @@ static int vhost_vdpa_get_vring_base(struct vhost_dev *dev,
 static int vhost_vdpa_set_vring_kick(struct vhost_dev *dev,
                                      struct vhost_vring_file *file)
 {
-    trace_vhost_vdpa_set_vring_kick(dev, file->index, file->fd);
-    return vhost_vdpa_call(dev, VHOST_SET_VRING_KICK, file);
+    struct vhost_vdpa *v = dev->opaque;
+    int vdpa_idx = file->index - dev->vq_index;
+
+    if (v->shadow_vqs_enabled) {
+        VhostShadowVirtqueue *svq = g_ptr_array_index(v->shadow_vqs, vdpa_idx);
+        vhost_svq_set_svq_kick_fd(svq, file->fd);
+        return 0;
+    } else {
+        return vhost_vdpa_set_vring_dev_kick(dev, file);
+    }
 }

 static int vhost_vdpa_set_vring_call(struct vhost_dev *dev,
diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
index XXXXXXX..XXXXXXX 100644
--- a/include/hw/virtio/vhost-vdpa.h
+++ b/include/hw/virtio/vhost-vdpa.h
@@ -XXX,XX +XXX,XX @@
 #ifndef HW_VIRTIO_VHOST_VDPA_H
 #define HW_VIRTIO_VHOST_VDPA_H

+#include <gmodule.h>
+
 #include "hw/virtio/virtio.h"
 #include "standard-headers/linux/vhost_types.h"

@@ -XXX,XX +XXX,XX @@ typedef struct vhost_vdpa {
     bool iotlb_batch_begin_sent;
     MemoryListener listener;
     struct vhost_vdpa_iova_range iova_range;
+    bool shadow_vqs_enabled;
+    GPtrArray *shadow_vqs;
     struct vhost_dev *dev;
     VhostVDPAHostNotifier notifier[VIRTIO_QUEUE_MAX];
 } VhostVDPA;
--
2.7.4
From: Eugenio Pérez <eperezma@redhat.com>

This makes qemu aware of the device's used buffers, allowing it to
write the guest memory with their contents if needed.
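The call path mirrors the kick path in the opposite direction; the
handler added below recovers its VhostShadowVirtqueue from the embedded
notifier with container_of(). A self-contained sketch of that recovery
pattern (generic names, illustrative only, not QEMU code):

    #include <stddef.h>

    struct notifier { int fd; };

    struct shadow_vq {
        int index;
        struct notifier hdev_call;      /* embedded member */
    };

    #define container_of(ptr, type, member) \
        ((type *)((char *)(ptr) - offsetof(type, member)))

    static struct shadow_vq *svq_from_call_notifier(struct notifier *n)
    {
        return container_of(n, struct shadow_vq, hdev_call);
    }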
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 hw/virtio/vhost-shadow-virtqueue.c | 38 ++++++++++++++++++++++++++++++++++++++
 hw/virtio/vhost-shadow-virtqueue.h |  4 ++++
 hw/virtio/vhost-vdpa.c             | 31 +++++++++++++++++++++++++++++--
 3 files changed, 71 insertions(+), 2 deletions(-)
diff --git a/hw/virtio/vhost-shadow-virtqueue.c b/hw/virtio/vhost-shadow-virtqueue.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/virtio/vhost-shadow-virtqueue.c
+++ b/hw/virtio/vhost-shadow-virtqueue.c
@@ -XXX,XX +XXX,XX @@ static void vhost_handle_guest_kick(EventNotifier *n)
 }

 /**
+ * Forward vhost notifications
+ *
+ * @n: hdev call event notifier, the one that device set to notify svq.
+ */
+static void vhost_svq_handle_call(EventNotifier *n)
+{
+    VhostShadowVirtqueue *svq = container_of(n, VhostShadowVirtqueue,
+                                             hdev_call);
+    event_notifier_test_and_clear(n);
+    event_notifier_set(&svq->svq_call);
+}
+
+/**
+ * Set the call notifier for the SVQ to call the guest
+ *
+ * @svq: Shadow virtqueue
+ * @call_fd: call notifier
+ *
+ * Called on BQL context.
+ */
+void vhost_svq_set_svq_call_fd(VhostShadowVirtqueue *svq, int call_fd)
+{
+    if (call_fd == VHOST_FILE_UNBIND) {
+        /*
+         * Fail event_notifier_set if called handling device call.
+         *
+         * SVQ still needs device notifications, since it needs to keep
+         * forwarding used buffers even with the unbind.
+         */
+        memset(&svq->svq_call, 0, sizeof(svq->svq_call));
+    } else {
+        event_notifier_init_fd(&svq->svq_call, call_fd);
+    }
+}
+
+/**
  * Set a new file descriptor for the guest to kick the SVQ and notify for avail
  *
  * @svq: The svq
@@ -XXX,XX +XXX,XX @@ VhostShadowVirtqueue *vhost_svq_new(void)
     }

     event_notifier_init_fd(&svq->svq_kick, VHOST_FILE_UNBIND);
+    event_notifier_set_handler(&svq->hdev_call, vhost_svq_handle_call);
     return g_steal_pointer(&svq);

 err_init_hdev_call:
@@ -XXX,XX +XXX,XX @@ void vhost_svq_free(gpointer pvq)
     VhostShadowVirtqueue *vq = pvq;
     vhost_svq_stop(vq);
     event_notifier_cleanup(&vq->hdev_kick);
+    event_notifier_set_handler(&vq->hdev_call, NULL);
     event_notifier_cleanup(&vq->hdev_call);
     g_free(vq);
 }
diff --git a/hw/virtio/vhost-shadow-virtqueue.h b/hw/virtio/vhost-shadow-virtqueue.h
index XXXXXXX..XXXXXXX 100644
--- a/hw/virtio/vhost-shadow-virtqueue.h
+++ b/hw/virtio/vhost-shadow-virtqueue.h
@@ -XXX,XX +XXX,XX @@ typedef struct VhostShadowVirtqueue {
      * So shadow virtqueue must not clean it, or we would lose VirtQueue one.
      */
     EventNotifier svq_kick;
+
+    /* Guest's call notifier, where the SVQ calls guest. */
+    EventNotifier svq_call;
 } VhostShadowVirtqueue;

 void vhost_svq_set_svq_kick_fd(VhostShadowVirtqueue *svq, int svq_kick_fd);
+void vhost_svq_set_svq_call_fd(VhostShadowVirtqueue *svq, int call_fd);

 void vhost_svq_stop(VhostShadowVirtqueue *svq);

diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -XXX,XX +XXX,XX @@ static int vhost_vdpa_set_vring_dev_kick(struct vhost_dev *dev,
     return vhost_vdpa_call(dev, VHOST_SET_VRING_KICK, file);
 }

+static int vhost_vdpa_set_vring_dev_call(struct vhost_dev *dev,
+                                         struct vhost_vring_file *file)
+{
+    trace_vhost_vdpa_set_vring_call(dev, file->index, file->fd);
+    return vhost_vdpa_call(dev, VHOST_SET_VRING_CALL, file);
+}
+
 /**
  * Set the shadow virtqueue descriptors to the device
  *
@@ -XXX,XX +XXX,XX @@ static int vhost_vdpa_set_vring_dev_kick(struct vhost_dev *dev,
  * @svq: The shadow virtqueue
  * @idx: The index of the virtqueue in the vhost device
  * @errp: Error
+ *
+ * Note that this function does not rewind kick file descriptor if cannot set
+ * call one.
  */
 static bool vhost_vdpa_svq_setup(struct vhost_dev *dev,
                                  VhostShadowVirtqueue *svq,
@@ -XXX,XX +XXX,XX @@ static bool vhost_vdpa_svq_setup(struct vhost_dev *dev,
     r = vhost_vdpa_set_vring_dev_kick(dev, &file);
     if (unlikely(r != 0)) {
         error_setg_errno(errp, -r, "Can't set device kick fd");
+        return false;
+    }
+
+    event_notifier = &svq->hdev_call;
+    file.fd = event_notifier_get_fd(event_notifier);
+    r = vhost_vdpa_set_vring_dev_call(dev, &file);
+    if (unlikely(r != 0)) {
+        error_setg_errno(errp, -r, "Can't set device call fd");
     }

     return r == 0;
@@ -XXX,XX +XXX,XX @@ static int vhost_vdpa_set_vring_kick(struct vhost_dev *dev,
 static int vhost_vdpa_set_vring_call(struct vhost_dev *dev,
                                      struct vhost_vring_file *file)
 {
-    trace_vhost_vdpa_set_vring_call(dev, file->index, file->fd);
-    return vhost_vdpa_call(dev, VHOST_SET_VRING_CALL, file);
+    struct vhost_vdpa *v = dev->opaque;
+
+    if (v->shadow_vqs_enabled) {
+        int vdpa_idx = file->index - dev->vq_index;
+        VhostShadowVirtqueue *svq = g_ptr_array_index(v->shadow_vqs, vdpa_idx);
+
+        vhost_svq_set_svq_call_fd(svq, file->fd);
+        return 0;
+    } else {
+        return vhost_vdpa_set_vring_dev_call(dev, file);
+    }
 }

 static int vhost_vdpa_get_features(struct vhost_dev *dev,
--
2.7.4
From: Eugenio Pérez <eperezma@redhat.com>

This allows SVQ to negotiate features with the guest and the device.
For the device, SVQ is a driver. While this function bypasses all
non-transport features, it needs to disable the features that SVQ does
not support when forwarding buffers. These include the packed vq
layout, indirect descriptors and event idx.

Future changes can add support to offer more features to the guest,
since the use of VirtQueue gives this for free. This is left out for
the moment for simplicity.
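The check boils down to bit manipulation over the transport-feature
range. A self-contained sketch of the approach (the bit numbers here
are made-up examples, not the real virtio values; the patch uses QEMU's
virtio definitions and set_bit()/clear_bit()):

    #include <stdbool.h>
    #include <stdint.h>

    #define BIT_ULL(n)        (1ULL << (n))
    #define TRANSPORT_START   28u   /* example range */
    #define TRANSPORT_END     38u
    #define REQUIRED_BIT      32u   /* e.g. VERSION_1: must be offered */

    static bool filter_features(uint64_t features, uint64_t *out)
    {
        bool ok = true;
        uint64_t filtered = features;

        for (uint64_t b = TRANSPORT_START; b <= TRANSPORT_END; ++b) {
            if (b == REQUIRED_BIT) {
                if (!(filtered & BIT_ULL(b))) {  /* required but missing */
                    filtered |= BIT_ULL(b);
                    ok = false;
                }
            } else if (filtered & BIT_ULL(b)) {  /* unsupported: strip */
                filtered &= ~BIT_ULL(b);
                ok = false;
            }
        }
        *out = filtered;
        return ok;
    }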
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 hw/virtio/vhost-shadow-virtqueue.c | 44 ++++++++++++++++++++++++++++++++++++++
 hw/virtio/vhost-shadow-virtqueue.h |  2 ++
 hw/virtio/vhost-vdpa.c             | 15 +++++++++++++
 3 files changed, 61 insertions(+)
diff --git a/hw/virtio/vhost-shadow-virtqueue.c b/hw/virtio/vhost-shadow-virtqueue.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/virtio/vhost-shadow-virtqueue.c
+++ b/hw/virtio/vhost-shadow-virtqueue.c
@@ -XXX,XX +XXX,XX @@
 #include "hw/virtio/vhost-shadow-virtqueue.h"

 #include "qemu/error-report.h"
+#include "qapi/error.h"
 #include "qemu/main-loop.h"
 #include "linux-headers/linux/vhost.h"

 /**
+ * Validate the transport device features that both guests can use with the SVQ
+ * and SVQs can use with the device.
+ *
+ * @dev_features: The features
+ * @errp: Error pointer
+ */
+bool vhost_svq_valid_features(uint64_t features, Error **errp)
+{
+    bool ok = true;
+    uint64_t svq_features = features;
+
+    for (uint64_t b = VIRTIO_TRANSPORT_F_START; b <= VIRTIO_TRANSPORT_F_END;
+         ++b) {
+        switch (b) {
+        case VIRTIO_F_ANY_LAYOUT:
+            continue;
+
+        case VIRTIO_F_ACCESS_PLATFORM:
+            /* SVQ trust in the host's IOMMU to translate addresses */
+        case VIRTIO_F_VERSION_1:
+            /* SVQ trust that the guest vring is little endian */
+            if (!(svq_features & BIT_ULL(b))) {
+                set_bit(b, &svq_features);
+                ok = false;
+            }
+            continue;
+
+        default:
+            if (svq_features & BIT_ULL(b)) {
+                clear_bit(b, &svq_features);
+                ok = false;
+            }
+        }
+    }
+
+    if (!ok) {
+        error_setg(errp, "SVQ Invalid device feature flags, offer: 0x%"PRIx64
+                   ", ok: 0x%"PRIx64, features, svq_features);
+    }
+    return ok;
+}
+
+/**
  * Forward guest notifications.
  *
  * @n: guest kick event notifier, the one that guest set to notify svq.
diff --git a/hw/virtio/vhost-shadow-virtqueue.h b/hw/virtio/vhost-shadow-virtqueue.h
index XXXXXXX..XXXXXXX 100644
--- a/hw/virtio/vhost-shadow-virtqueue.h
+++ b/hw/virtio/vhost-shadow-virtqueue.h
@@ -XXX,XX +XXX,XX @@ typedef struct VhostShadowVirtqueue {
     EventNotifier svq_call;
 } VhostShadowVirtqueue;

+bool vhost_svq_valid_features(uint64_t features, Error **errp);
+
 void vhost_svq_set_svq_kick_fd(VhostShadowVirtqueue *svq, int svq_kick_fd);
 void vhost_svq_set_svq_call_fd(VhostShadowVirtqueue *svq, int call_fd);

diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -XXX,XX +XXX,XX @@ static int vhost_vdpa_init_svq(struct vhost_dev *hdev, struct vhost_vdpa *v,
                                Error **errp)
 {
     g_autoptr(GPtrArray) shadow_vqs = NULL;
+    uint64_t dev_features, svq_features;
+    int r;
+    bool ok;

     if (!v->shadow_vqs_enabled) {
         return 0;
     }

+    r = hdev->vhost_ops->vhost_get_features(hdev, &dev_features);
+    if (r != 0) {
+        error_setg_errno(errp, -r, "Can't get vdpa device features");
+        return r;
+    }
+
+    svq_features = dev_features;
+    ok = vhost_svq_valid_features(svq_features, errp);
+    if (unlikely(!ok)) {
+        return -1;
+    }
+
     shadow_vqs = g_ptr_array_new_full(hdev->nvqs, vhost_svq_free);
     for (unsigned n = 0; n < hdev->nvqs; ++n) {
         g_autoptr(VhostShadowVirtqueue) svq = vhost_svq_new();
249
+ stored->ipv6_host = g_strdup(res);
250
+
251
+ res = inet_ntop(AF_INET6, &ip6_dns,
252
+ addrstr, sizeof(addrstr));
253
+
254
+ stored->has_ipv6_dns = true;
255
+ stored->ipv6_dns = g_strdup(res);
256
+ }
257
+
258
+ if (smb_export) {
259
+ stored->has_smb = true;
260
+ stored->smb = g_strdup(smb_export);
261
+ }
262
+
263
+ if (vsmbserver) {
264
+ stored->has_smbserver = true;
265
+ stored->smbserver = g_strdup(vsmbserver);
266
+ }
267
+
268
+ if (tftp_server_name) {
269
+ stored->has_tftp_server_name = true;
270
+ stored->tftp_server_name = g_strdup(tftp_server_name);
271
+ }
272
+
273
snprintf(nc->info_str, sizeof(nc->info_str),
274
"net=%s,restrict=%s", inet_ntoa(net),
275
restricted ? "on" : "off");
276
@@ -XXX,XX +XXX,XX @@ static int net_slirp_init(NetClientState *peer, const char *model,
277
s->poll_notifier.notify = net_slirp_poll_notify;
278
main_loop_poll_add_notifier(&s->poll_notifier);
279
280
+ stored_hostfwd = &stored->hostfwd;
281
+ stored_guestfwd = &stored->guestfwd;
282
+
283
for (config = slirp_configs; config; config = config->next) {
284
+ String *element = g_new0(String, 1);
285
+
286
+ element->str = g_strdup(config->str);
287
if (config->flags & SLIRP_CFG_HOSTFWD) {
288
if (slirp_hostfwd(s, config->str, errp) < 0) {
289
goto error;
290
}
291
+ stored->has_hostfwd = true;
292
+ QAPI_LIST_APPEND(stored_hostfwd, element);
293
} else {
294
if (slirp_guestfwd(s, config->str, errp) < 0) {
295
goto error;
296
}
297
+ stored->has_guestfwd = true;
298
+ QAPI_LIST_APPEND(stored_guestfwd, element);
299
}
300
}
301
#ifndef _WIN32
302
diff --git a/net/socket.c b/net/socket.c
303
index XXXXXXX..XXXXXXX 100644
304
--- a/net/socket.c
305
+++ b/net/socket.c
306
@@ -XXX,XX +XXX,XX @@ static NetSocketState *net_socket_fd_init_dgram(NetClientState *peer,
307
NetSocketState *s;
308
SocketAddress *sa;
309
SocketAddressType sa_type;
310
+ NetdevSocketOptions *stored;
311
312
sa = socket_local_address(fd, errp);
313
if (!sa) {
314
@@ -XXX,XX +XXX,XX @@ static NetSocketState *net_socket_fd_init_dgram(NetClientState *peer,
315
net_socket_rs_init(&s->rs, net_socket_rs_finalize, false);
316
net_socket_read_poll(s, true);
317
318
+ /* Store startup parameters */
319
+ nc->stored_config = g_new0(NetdevInfo, 1);
320
+ nc->stored_config->type = NET_BACKEND_SOCKET;
321
+ stored = &nc->stored_config->u.socket;
322
+
323
+ stored->has_fd = true;
324
+ stored->fd = g_strdup_printf("%d", fd);
325
+
326
/* mcast: save bound address as dst */
327
if (is_connected && mcast != NULL) {
328
+ stored->has_mcast = true;
329
+ stored->mcast = g_strdup(mcast);
330
+
331
s->dgram_dst = saddr;
332
snprintf(nc->info_str, sizeof(nc->info_str),
333
"socket: fd=%d (cloned mcast=%s:%d)",
334
@@ -XXX,XX +XXX,XX @@ static NetSocketState *net_socket_fd_init_stream(NetClientState *peer,
335
{
336
NetClientState *nc;
337
NetSocketState *s;
338
+ NetdevSocketOptions *stored;
339
340
nc = qemu_new_net_client(&net_socket_info, peer, model, name);
341
342
@@ -XXX,XX +XXX,XX @@ static NetSocketState *net_socket_fd_init_stream(NetClientState *peer,
343
} else {
344
qemu_set_fd_handler(s->fd, NULL, net_socket_connect, s);
345
}
346
+
347
+ /* Store startup parameters */
348
+ nc->stored_config = g_new0(NetdevInfo, 1);
349
+ nc->stored_config->type = NET_BACKEND_SOCKET;
350
+ stored = &nc->stored_config->u.socket;
351
+
352
+ stored->has_fd = true;
353
+ stored->fd = g_strdup_printf("%d", fd);
354
+
355
return s;
356
}
357
358
@@ -XXX,XX +XXX,XX @@ static void net_socket_accept(void *opaque)
359
struct sockaddr_in saddr;
360
socklen_t len;
361
int fd;
362
+ NetdevSocketOptions *stored;
363
364
for(;;) {
365
len = sizeof(saddr);
366
@@ -XXX,XX +XXX,XX @@ static void net_socket_accept(void *opaque)
367
s->fd = fd;
368
s->nc.link_down = false;
369
net_socket_connect(s);
370
+
371
+ /* Store additional startup parameters (extend net_socket_listen_init) */
372
+ stored = &s->nc.stored_config->u.socket;
373
+
374
+ stored->has_fd = true;
375
+ stored->fd = g_strdup_printf("%d", fd);
376
+
377
snprintf(s->nc.info_str, sizeof(s->nc.info_str),
378
"socket: connection from %s:%d",
379
inet_ntoa(saddr.sin_addr), ntohs(saddr.sin_port));
380
@@ -XXX,XX +XXX,XX @@ static int net_socket_listen_init(NetClientState *peer,
381
NetSocketState *s;
382
struct sockaddr_in saddr;
383
int fd, ret;
384
+ NetdevSocketOptions *stored;
385
386
if (parse_host_port(&saddr, host_str, errp) < 0) {
387
return -1;
388
@@ -XXX,XX +XXX,XX @@ static int net_socket_listen_init(NetClientState *peer,
389
net_socket_rs_init(&s->rs, net_socket_rs_finalize, false);
390
391
qemu_set_fd_handler(s->listen_fd, net_socket_accept, NULL, s);
392
+
393
+ /* Store startup parameters */
394
+ nc->stored_config = g_new0(NetdevInfo, 1);
395
+ nc->stored_config->type = NET_BACKEND_SOCKET;
396
+ stored = &nc->stored_config->u.socket;
397
+
398
+ stored->has_listen = true;
399
+ stored->listen = g_strdup(host_str);
400
+
401
return 0;
402
}
403
404
@@ -XXX,XX +XXX,XX @@ static int net_socket_connect_init(NetClientState *peer,
405
NetSocketState *s;
406
int fd, connected, ret;
407
struct sockaddr_in saddr;
408
+ NetdevSocketOptions *stored;
409
410
if (parse_host_port(&saddr, host_str, errp) < 0) {
411
return -1;
412
@@ -XXX,XX +XXX,XX @@ static int net_socket_connect_init(NetClientState *peer,
413
return -1;
414
}
415
416
+ /* Store additional startup parameters (extend net_socket_fd_init) */
417
+ stored = &s->nc.stored_config->u.socket;
418
+
419
+ stored->has_connect = true;
420
+ stored->connect = g_strdup(host_str);
421
+
422
snprintf(s->nc.info_str, sizeof(s->nc.info_str),
423
"socket: connect to %s:%d",
424
inet_ntoa(saddr.sin_addr), ntohs(saddr.sin_port));
425
@@ -XXX,XX +XXX,XX @@ static int net_socket_mcast_init(NetClientState *peer,
426
int fd;
427
struct sockaddr_in saddr;
428
struct in_addr localaddr, *param_localaddr;
429
+ NetdevSocketOptions *stored;
430
431
if (parse_host_port(&saddr, host_str, errp) < 0) {
432
return -1;
433
@@ -XXX,XX +XXX,XX @@ static int net_socket_mcast_init(NetClientState *peer,
434
435
s->dgram_dst = saddr;
436
437
+ /* Store additional startup parameters (extend net_socket_fd_init) */
438
+ stored = &s->nc.stored_config->u.socket;
439
+
440
+ if (!stored->has_mcast) {
441
+ stored->has_mcast = true;
442
+ stored->mcast = g_strdup(host_str);
443
+ }
444
+
445
+ if (localaddr_str) {
446
+ stored->has_localaddr = true;
447
+ stored->localaddr = g_strdup(localaddr_str);
448
+ }
449
+
450
snprintf(s->nc.info_str, sizeof(s->nc.info_str),
451
"socket: mcast=%s:%d",
452
inet_ntoa(saddr.sin_addr), ntohs(saddr.sin_port));
453
@@ -XXX,XX +XXX,XX @@ static int net_socket_udp_init(NetClientState *peer,
454
NetSocketState *s;
455
int fd, ret;
456
struct sockaddr_in laddr, raddr;
457
+ NetdevSocketOptions *stored;
458
459
if (parse_host_port(&laddr, lhost, errp) < 0) {
460
return -1;
461
@@ -XXX,XX +XXX,XX @@ static int net_socket_udp_init(NetClientState *peer,
462
463
s->dgram_dst = raddr;
464
465
+ /* Store additional startup parameters (extend net_socket_fd_init) */
466
+ stored = &s->nc.stored_config->u.socket;
467
+
468
+ stored->has_localaddr = true;
469
+ stored->localaddr = g_strdup(lhost);
470
+
471
+ stored->has_udp = true;
472
+ stored->udp = g_strdup(rhost);
473
+
474
snprintf(s->nc.info_str, sizeof(s->nc.info_str),
475
"socket: udp=%s:%d",
476
inet_ntoa(raddr.sin_addr), ntohs(raddr.sin_port));
477
diff --git a/net/tap-win32.c b/net/tap-win32.c
478
index XXXXXXX..XXXXXXX 100644
479
--- a/net/tap-win32.c
480
+++ b/net/tap-win32.c
481
@@ -XXX,XX +XXX,XX @@ static int tap_win32_init(NetClientState *peer, const char *model,
482
NetClientState *nc;
483
TAPState *s;
484
tap_win32_overlapped_t *handle;
485
+ NetdevTapOptions *stored;
486
487
if (tap_win32_open(&handle, ifname) < 0) {
488
printf("tap: Could not open '%s'\n", ifname);
489
@@ -XXX,XX +XXX,XX @@ static int tap_win32_init(NetClientState *peer, const char *model,
490
491
s = DO_UPCAST(TAPState, nc, nc);
492
493
+ /* Store startup parameters */
494
+ nc->stored_config = g_new0(NetdevInfo, 1);
495
+ nc->stored_config->type = NET_BACKEND_TAP;
496
+ stored = &nc->stored_config->u.tap;
497
+
498
+ stored->has_ifname = true;
499
+ stored->ifname = g_strdup(ifname);
500
+
501
snprintf(s->nc.info_str, sizeof(s->nc.info_str),
502
"tap: ifname=%s", ifname);
503
504
diff --git a/net/tap.c b/net/tap.c
505
index XXXXXXX..XXXXXXX 100644
506
--- a/net/tap.c
507
+++ b/net/tap.c
508
@@ -XXX,XX +XXX,XX @@ int net_init_bridge(const Netdev *netdev, const char *name,
509
const char *helper, *br;
510
TAPState *s;
511
int fd, vnet_hdr;
512
+ NetdevBridgeOptions *stored;
513
514
assert(netdev->type == NET_CLIENT_DRIVER_BRIDGE);
515
bridge = &netdev->u.bridge;
516
@@ -XXX,XX +XXX,XX @@ int net_init_bridge(const Netdev *netdev, const char *name,
517
}
518
s = net_tap_fd_init(peer, "bridge", name, fd, vnet_hdr);
519
520
+ /* Store startup parameters */
521
+ s->nc.stored_config = g_new0(NetdevInfo, 1);
522
+ s->nc.stored_config->type = NET_BACKEND_BRIDGE;
523
+ stored = &s->nc.stored_config->u.bridge;
524
+
525
+ if (br) {
526
+ stored->has_br = true;
527
+ stored->br = g_strdup(br);
528
+ }
529
+
530
+ if (helper) {
531
+ stored->has_helper = true;
532
+ stored->helper = g_strdup(helper);
533
+ }
534
+
535
snprintf(s->nc.info_str, sizeof(s->nc.info_str), "helper=%s,br=%s", helper,
536
br);
537
538
@@ -XXX,XX +XXX,XX @@ static void net_init_tap_one(const NetdevTapOptions *tap, NetClientState *peer,
539
const char *model, const char *name,
540
const char *ifname, const char *script,
541
const char *downscript, const char *vhostfdname,
542
- int vnet_hdr, int fd, Error **errp)
543
+ int vnet_hdr, int fd, NetdevInfo **common_stored,
544
+ Error **errp)
545
{
546
Error *err = NULL;
547
TAPState *s = net_tap_fd_init(peer, model, name, fd, vnet_hdr);
548
int vhostfd;
549
+ NetdevTapOptions *stored;
550
551
tap_set_sndbuf(s->fd, tap, &err);
552
if (err) {
553
@@ -XXX,XX +XXX,XX @@ static void net_init_tap_one(const NetdevTapOptions *tap, NetClientState *peer,
554
return;
555
}
556
557
+ /* Store startup parameters */
558
+ if (!*common_stored) {
559
+ *common_stored = g_new0(NetdevInfo, 1);
560
+ (*common_stored)->type = NET_BACKEND_TAP;
561
+ s->nc.stored_config = *common_stored;
562
+ }
563
+ stored = &(*common_stored)->u.tap;
564
+
565
+ if (tap->has_sndbuf && !stored->has_sndbuf) {
566
+ stored->has_sndbuf = true;
567
+ stored->sndbuf = tap->sndbuf;
568
+ }
569
+
570
+ if (vnet_hdr && !stored->has_vnet_hdr) {
571
+ stored->has_vnet_hdr = true;
572
+ stored->vnet_hdr = true;
573
+ }
574
+
575
if (tap->has_fd || tap->has_fds) {
576
+ if (!stored->has_fds) {
577
+ stored->has_fds = true;
578
+ stored->fds = g_strdup_printf("%d", fd);
579
+ } else {
580
+ char *tmp_s = stored->fds;
581
+ stored->fds = g_strdup_printf("%s:%d", stored->fds, fd);
582
+ g_free(tmp_s);
583
+ }
584
+
585
snprintf(s->nc.info_str, sizeof(s->nc.info_str), "fd=%d", fd);
586
} else if (tap->has_helper) {
587
+ if (!stored->has_helper) {
588
+ stored->has_helper = true;
589
+ stored->helper = g_strdup(tap->helper);
590
+ }
591
+
592
+ if (!stored->has_br) {
593
+ stored->has_br = true;
594
+ stored->br = tap->has_br ? g_strdup(tap->br) :
595
+ g_strdup(DEFAULT_BRIDGE_INTERFACE);
596
+ }
597
+
598
snprintf(s->nc.info_str, sizeof(s->nc.info_str), "helper=%s",
599
tap->helper);
600
} else {
601
+ if (ifname && !stored->has_ifname) {
602
+ stored->has_ifname = true;
603
+ stored->ifname = g_strdup(ifname);
604
+ }
605
+
606
+ if (script && !stored->has_script) {
607
+ stored->has_script = true;
608
+ stored->script = g_strdup(script);
609
+ }
610
+
611
+ if (downscript && !stored->has_downscript) {
612
+ stored->has_downscript = true;
613
+ stored->downscript = g_strdup(downscript);
614
+ }
615
+
616
snprintf(s->nc.info_str, sizeof(s->nc.info_str),
617
"ifname=%s,script=%s,downscript=%s", ifname, script,
618
downscript);
619
@@ -XXX,XX +XXX,XX @@ static void net_init_tap_one(const NetdevTapOptions *tap, NetClientState *peer,
620
vhostfdname || (tap->has_vhostforce && tap->vhostforce)) {
621
VhostNetOptions options;
622
623
+ stored->has_vhost = true;
624
+ stored->vhost = true;
625
+
626
+ if (tap->has_vhostforce && tap->vhostforce) {
627
+ stored->has_vhostforce = true;
628
+ stored->vhostforce = true;
629
+ }
630
+
631
options.backend_type = VHOST_BACKEND_TYPE_KERNEL;
632
options.net_backend = &s->nc;
633
if (tap->has_poll_us) {
634
+ stored->has_poll_us = true;
635
+ stored->poll_us = tap->poll_us;
636
+
637
options.busyloop_timeout = tap->poll_us;
638
} else {
639
options.busyloop_timeout = 0;
640
@@ -XXX,XX +XXX,XX @@ static void net_init_tap_one(const NetdevTapOptions *tap, NetClientState *peer,
641
}
642
options.opaque = (void *)(uintptr_t)vhostfd;
643
644
+ if (!stored->has_vhostfds) {
645
+ stored->has_vhostfds = true;
646
+ stored->vhostfds = g_strdup_printf("%d", vhostfd);
647
+ } else {
648
+ char *tmp_s = stored->vhostfds;
649
+ stored->vhostfds = g_strdup_printf("%s:%d", stored->vhostfds, vhostfd);
650
+ g_free(tmp_s);
651
+ }
652
+
653
s->vhost_net = vhost_net_init(&options);
654
if (!s->vhost_net) {
655
if (tap->has_vhostforce && tap->vhostforce) {
656
@@ -XXX,XX +XXX,XX @@ int net_init_tap(const Netdev *netdev, const char *name,
657
const char *vhostfdname;
658
char ifname[128];
659
int ret = 0;
660
+ NetdevInfo *common_stored = NULL; /* will store configuration */
661
662
assert(netdev->type == NET_CLIENT_DRIVER_TAP);
663
tap = &netdev->u.tap;
664
@@ -XXX,XX +XXX,XX @@ int net_init_tap(const Netdev *netdev, const char *name,
665
666
net_init_tap_one(tap, peer, "tap", name, NULL,
667
script, downscript,
668
- vhostfdname, vnet_hdr, fd, &err);
669
+ vhostfdname, vnet_hdr, fd, &common_stored, &err);
670
if (err) {
671
error_propagate(errp, err);
672
close(fd);
673
@@ -XXX,XX +XXX,XX @@ int net_init_tap(const Netdev *netdev, const char *name,
674
net_init_tap_one(tap, peer, "tap", name, ifname,
675
script, downscript,
676
tap->has_vhostfds ? vhost_fds[i] : NULL,
677
- vnet_hdr, fd, &err);
678
+ vnet_hdr, fd, &common_stored, &err);
679
if (err) {
680
error_propagate(errp, err);
681
ret = -1;
682
@@ -XXX,XX +XXX,XX @@ free_fail:
683
684
net_init_tap_one(tap, peer, "bridge", name, ifname,
685
script, downscript, vhostfdname,
686
- vnet_hdr, fd, &err);
687
+ vnet_hdr, fd, &common_stored, &err);
688
if (err) {
689
error_propagate(errp, err);
690
close(fd);
691
@@ -XXX,XX +XXX,XX @@ free_fail:
692
net_init_tap_one(tap, peer, "tap", name, ifname,
693
i >= 1 ? "no" : script,
694
i >= 1 ? "no" : downscript,
695
- vhostfdname, vnet_hdr, fd, &err);
696
+ vhostfdname, vnet_hdr, fd,
697
+ &common_stored, &err);
698
if (err) {
699
error_propagate(errp, err);
700
close(fd);
701
diff --git a/net/vde.c b/net/vde.c
702
index XXXXXXX..XXXXXXX 100644
703
--- a/net/vde.c
704
+++ b/net/vde.c
705
@@ -XXX,XX +XXX,XX @@ static int net_vde_init(NetClientState *peer, const char *model,
706
VDECONN *vde;
707
char *init_group = (char *)group;
708
char *init_sock = (char *)sock;
709
+ NetdevVdeOptions *stored;
710
711
struct vde_open_args args = {
712
.port = port,
713
@@ -XXX,XX +XXX,XX @@ static int net_vde_init(NetClientState *peer, const char *model,
714
715
qemu_set_fd_handler(vde_datafd(s->vde), vde_to_qemu, NULL, s);
716
717
+ /* Store startup parameters */
718
+ nc->stored_config = g_new0(NetdevInfo, 1);
719
+ nc->stored_config->type = NET_BACKEND_VDE;
720
+ stored = &nc->stored_config->u.vde;
721
+
722
+ if (sock) {
723
+ stored->has_sock = true;
724
+ stored->sock = g_strdup(sock);
725
+ }
726
+
727
+ stored->has_port = true;
728
+ stored->port = port;
729
+
730
+ if (group) {
731
+ stored->has_group = true;
732
+ stored->group = g_strdup(group);
733
+ }
734
+
735
+ stored->has_mode = true;
736
+ stored->mode = mode;
737
+
738
return 0;
739
}
740
741
diff --git a/net/vhost-user.c b/net/vhost-user.c
742
index XXXXXXX..XXXXXXX 100644
743
--- a/net/vhost-user.c
744
+++ b/net/vhost-user.c
745
@@ -XXX,XX +XXX,XX @@ static void net_vhost_user_event(void *opaque, QEMUChrEvent event)
746
}
747
748
static int net_vhost_user_init(NetClientState *peer, const char *device,
749
- const char *name, Chardev *chr,
750
- int queues)
751
+ const char *name, const char *chardev,
752
+ Chardev *chr, int queues)
753
{
754
Error *err = NULL;
755
NetClientState *nc, *nc0 = NULL;
756
NetVhostUserState *s = NULL;
757
VhostUserState *user;
758
int i;
759
+ NetdevVhostUserOptions *stored;
760
761
assert(name);
762
assert(queues > 0);
763
@@ -XXX,XX +XXX,XX @@ static int net_vhost_user_init(NetClientState *peer, const char *device,
764
765
assert(s->vhost_net);
766
767
+ /* Store startup parameters */
768
+ nc0->stored_config = g_new0(NetdevInfo, 1);
769
+ nc0->stored_config->type = NET_BACKEND_VHOST_USER;
770
+ stored = &nc0->stored_config->u.vhost_user;
771
+
772
+ stored->chardev = g_strdup(chardev);
773
+
774
+ stored->has_queues = true;
775
+ stored->queues = queues;
776
+
777
return 0;
778
779
err:
780
@@ -XXX,XX +XXX,XX @@ int net_init_vhost_user(const Netdev *netdev, const char *name,
781
return -1;
782
}
783
784
- return net_vhost_user_init(peer, "vhost_user", name, chr, queues);
785
+ return net_vhost_user_init(peer, "vhost_user", name,
786
+ vhost_user_opts->chardev, chr, queues);
787
}
788
diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
789
index XXXXXXX..XXXXXXX 100644
790
--- a/net/vhost-vdpa.c
791
+++ b/net/vhost-vdpa.c
792
@@ -XXX,XX +XXX,XX @@ static int net_vhost_vdpa_init(NetClientState *peer, const char *device,
793
VhostVDPAState *s;
794
int vdpa_device_fd = -1;
795
int ret = 0;
796
+ NetdevVhostVDPAOptions *stored;
797
+
798
assert(name);
799
nc = qemu_new_net_client(&net_vhost_vdpa_info, peer, device, name);
800
+
801
+ /* Store startup parameters */
802
+ nc->stored_config = g_new0(NetdevInfo, 1);
803
+ nc->stored_config->type = NET_BACKEND_VHOST_VDPA;
804
+ stored = &nc->stored_config->u.vhost_vdpa;
805
+
806
+ stored->has_vhostdev = true;
807
+ stored->vhostdev = g_strdup(vhostdev);
808
+
809
+ stored->has_queues = true;
810
+ stored->queues = 1; /* TODO: change when support multiqueue */
811
+
812
snprintf(nc->info_str, sizeof(nc->info_str), TYPE_VHOST_VDPA);
813
nc->queue_index = 0;
814
s = DO_UPCAST(VhostVDPAState, nc, nc);
815
diff --git a/qapi/net.json b/qapi/net.json
816
index XXXXXXX..XXXXXXX 100644
817
--- a/qapi/net.json
818
+++ b/qapi/net.json
819
@@ -XXX,XX +XXX,XX @@
820
##
821
{ 'event': 'FAILOVER_NEGOTIATED',
822
'data': {'device-id': 'str'} }
823
+
824
+##
825
+# @NetBackend:
826
+#
827
+# Available netdev backend drivers.
828
+#
829
+# Since: 6.0
830
+##
831
+{ 'enum': 'NetBackend',
832
+ 'data': [ 'bridge', 'l2tpv3', 'netmap', 'socket', 'tap', 'user', 'vde',
833
+ 'vhost-user', 'vhost-vdpa' ] }
834
+
835
+##
836
+# @NetdevInfo:
837
+#
838
+# Configuration of a network backend device (netdev).
839
+#
840
+# @id: Device identifier.
841
+#
842
+# @type: Specify the driver used for interpreting remaining arguments.
843
+#
844
+# @peer-id: The connected frontend network device name (absent if no frontend
845
+# is connected).
846
+#
847
+# Since: 6.0
848
+##
849
+{ 'union': 'NetdevInfo',
850
+ 'base': { 'id': 'str',
851
+ 'type': 'NetBackend',
852
+ '*peer-id': 'str' },
853
+ 'discriminator': 'type',
854
+ 'data': {
855
+ 'bridge': 'NetdevBridgeOptions',
856
+ 'l2tpv3': 'NetdevL2TPv3Options',
857
+ 'netmap': 'NetdevNetmapOptions',
858
+ 'socket': 'NetdevSocketOptions',
859
+ 'tap': 'NetdevTapOptions',
860
+ 'user': 'NetdevUserOptions',
861
+ 'vde': 'NetdevVdeOptions',
862
+ 'vhost-user': 'NetdevVhostUserOptions',
863
+ 'vhost-vdpa': 'NetdevVhostVDPAOptions' } }
864
+
865
+##
866
+# @query-netdev:
867
+#
868
+# Get a list of @NetdevInfo for all virtual network backend devices (netdevs).
869
+#
870
+# Returns: a list of @NetdevInfo describing each netdev.
871
+#
872
+# Since: 6.0
873
+#
874
+# Example:
875
+#
876
+# -> { "execute": "query-netdev" }
877
+# <- { "return": [
878
+# {
879
+# "ipv6": true,
880
+# "ipv4": true,
881
+# "host": "10.0.2.2",
882
+# "ipv6-dns": "fec0::3",
883
+# "ipv6-prefix": "fec0::",
884
+# "net": "10.0.2.0/255.255.255.0",
885
+# "ipv6-host": "fec0::2",
886
+# "type": "user",
887
+# "peer-id": "net0",
888
+# "dns": "10.0.2.3",
889
+# "hostfwd": [
890
+# {
891
+# "str": "tcp::20004-:22"
892
+# }
893
+# ],
894
+# "ipv6-prefixlen": 64,
895
+# "id": "netdev0",
896
+# "restrict": false
897
+# }
898
+# ]
899
+# }
900
+#
901
+##
902
+{ 'command': 'query-netdev', 'returns': ['NetdevInfo'] }
903
--
125
--
904
2.7.4
126
2.7.4
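
For reference, the vhost_svq_valid_features() hunks interleaved above follow a simple accept-list scheme: every offered feature bit that SVQ cannot forward is cleared from the mask and validation fails with an "offer/ok" report. A minimal standalone model of that idea is sketched below, simplified to the reject-unsupported case; the accept mask is an illustrative assumption, not QEMU's definitive policy (only bit 32, VIRTIO_F_VERSION_1, is a real feature bit here).

    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    #define BIT_ULL(n) (1ULL << (n))
    /* Illustrative accept mask; bit 32 is VIRTIO_F_VERSION_1. */
    #define SVQ_SUPPORTED_MASK BIT_ULL(32)

    static bool svq_valid_features(uint64_t features)
    {
        /* Drop every offered bit the shadow virtqueue cannot forward. */
        uint64_t ok_mask = features & SVQ_SUPPORTED_MASK;
        bool ok = (ok_mask == features);

        if (!ok) {
            fprintf(stderr, "offer: 0x%llx, ok: 0x%llx\n",
                    (unsigned long long)features,
                    (unsigned long long)ok_mask);
        }
        return ok;
    }

    int main(void)
    {
        return svq_valid_features(BIT_ULL(32)) ? 0 : 1;
    }
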
905
127
906
128
1
From: Alexander Bulekov <alxndr@bu.edu>
1
From: Eugenio Pérez <eperezma@redhat.com>
2
2
3
This patch switches to using qemu_receive_packet(), which can detect
3
It reports the shadow virtqueue address from QEMU's virtual address space.
4
reentrancy and return early.
5
4
6
This is intended to address CVE-2021-3416.
5
Since this address will differ from the guest's vaddr, but the device can
6
access it, SVQ takes special care with its alignment and keeps it free of garbage
7
data. It assumes that the IOMMU operates on host_page_size ranges for that.
7
8
8
Cc: Prasad J Pandit <ppandit@redhat.com>
9
Acked-by: Michael S. Tsirkin <mst@redhat.com>
9
Cc: qemu-stable@nongnu.org
10
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
10
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
11
Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
12
Signed-off-by: Jason Wang <jasowang@redhat.com>
11
Signed-off-by: Jason Wang <jasowang@redhat.com>
13
---
12
---
14
hw/net/cadence_gem.c | 4 ++--
13
hw/virtio/vhost-shadow-virtqueue.c | 29 +++++++++++++++++++++++++++++
15
1 file changed, 2 insertions(+), 2 deletions(-)
14
hw/virtio/vhost-shadow-virtqueue.h | 9 +++++++++
15
2 files changed, 38 insertions(+)
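
For concreteness, here is the arithmetic behind the vhost_svq_driver_area_size() and vhost_svq_device_area_size() hunks below, as a small standalone program. It assumes a 4 KiB host page and a queue of 256 entries; the structure sizes come from the split-ring layout in the virtio spec (16-byte descriptors, 4-byte avail/used headers, 2-byte avail entries, 8-byte used entries).

    #include <stddef.h>
    #include <stdio.h>

    #define ROUND_UP(n, d) ((((n) + (d) - 1) / (d)) * (d))

    int main(void)
    {
        const size_t page = 4096;   /* assumed host page size */
        const size_t num = 256;     /* example vring size */

        size_t desc  = 16 * num;        /* vring_desc[num]            */
        size_t avail = 4 + 2 * num;     /* flags, idx, ring[num]      */
        size_t used  = 4 + 8 * num;     /* flags, idx, used_elem[num] */

        /* driver area: 4096 + 516 = 4612 -> one 8192-byte allocation */
        printf("driver area: %zu\n", ROUND_UP(desc + avail, page));
        /* device area: 2052 -> one 4096-byte allocation */
        printf("device area: %zu\n", ROUND_UP(used, page));
        return 0;
    }
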
16
16
17
diff --git a/hw/net/cadence_gem.c b/hw/net/cadence_gem.c
17
diff --git a/hw/virtio/vhost-shadow-virtqueue.c b/hw/virtio/vhost-shadow-virtqueue.c
18
index XXXXXXX..XXXXXXX 100644
18
index XXXXXXX..XXXXXXX 100644
19
--- a/hw/net/cadence_gem.c
19
--- a/hw/virtio/vhost-shadow-virtqueue.c
20
+++ b/hw/net/cadence_gem.c
20
+++ b/hw/virtio/vhost-shadow-virtqueue.c
21
@@ -XXX,XX +XXX,XX @@ static void gem_transmit(CadenceGEMState *s)
21
@@ -XXX,XX +XXX,XX @@ void vhost_svq_set_svq_call_fd(VhostShadowVirtqueue *svq, int call_fd)
22
/* Send the packet somewhere */
22
}
23
if (s->phy_loop || (s->regs[GEM_NWCTRL] &
23
24
GEM_NWCTRL_LOCALLOOP)) {
24
/**
25
- gem_receive(qemu_get_queue(s->nic), s->tx_packet,
25
+ * Get the shadow vq vring address.
26
- total_bytes);
26
+ * @svq: Shadow virtqueue
27
+ qemu_receive_packet(qemu_get_queue(s->nic), s->tx_packet,
27
+ * @addr: Destination to store address
28
+ total_bytes);
28
+ */
29
} else {
29
+void vhost_svq_get_vring_addr(const VhostShadowVirtqueue *svq,
30
qemu_send_packet(qemu_get_queue(s->nic), s->tx_packet,
30
+ struct vhost_vring_addr *addr)
31
total_bytes);
31
+{
32
+ addr->desc_user_addr = (uint64_t)svq->vring.desc;
33
+ addr->avail_user_addr = (uint64_t)svq->vring.avail;
34
+ addr->used_user_addr = (uint64_t)svq->vring.used;
35
+}
36
+
37
+size_t vhost_svq_driver_area_size(const VhostShadowVirtqueue *svq)
38
+{
39
+ size_t desc_size = sizeof(vring_desc_t) * svq->vring.num;
40
+ size_t avail_size = offsetof(vring_avail_t, ring) +
41
+ sizeof(uint16_t) * svq->vring.num;
42
+
43
+ return ROUND_UP(desc_size + avail_size, qemu_real_host_page_size);
44
+}
45
+
46
+size_t vhost_svq_device_area_size(const VhostShadowVirtqueue *svq)
47
+{
48
+ size_t used_size = offsetof(vring_used_t, ring) +
49
+ sizeof(vring_used_elem_t) * svq->vring.num;
50
+ return ROUND_UP(used_size, qemu_real_host_page_size);
51
+}
52
+
53
+/**
54
* Set a new file descriptor for the guest to kick the SVQ and notify for avail
55
*
56
* @svq: The svq
57
diff --git a/hw/virtio/vhost-shadow-virtqueue.h b/hw/virtio/vhost-shadow-virtqueue.h
58
index XXXXXXX..XXXXXXX 100644
59
--- a/hw/virtio/vhost-shadow-virtqueue.h
60
+++ b/hw/virtio/vhost-shadow-virtqueue.h
61
@@ -XXX,XX +XXX,XX @@
62
#define VHOST_SHADOW_VIRTQUEUE_H
63
64
#include "qemu/event_notifier.h"
65
+#include "hw/virtio/virtio.h"
66
+#include "standard-headers/linux/vhost_types.h"
67
68
/* Shadow virtqueue to relay notifications */
69
typedef struct VhostShadowVirtqueue {
70
+ /* Shadow vring */
71
+ struct vring vring;
72
+
73
/* Shadow kick notifier, sent to vhost */
74
EventNotifier hdev_kick;
75
/* Shadow call notifier, sent to vhost */
76
@@ -XXX,XX +XXX,XX @@ bool vhost_svq_valid_features(uint64_t features, Error **errp);
77
78
void vhost_svq_set_svq_kick_fd(VhostShadowVirtqueue *svq, int svq_kick_fd);
79
void vhost_svq_set_svq_call_fd(VhostShadowVirtqueue *svq, int call_fd);
80
+void vhost_svq_get_vring_addr(const VhostShadowVirtqueue *svq,
81
+ struct vhost_vring_addr *addr);
82
+size_t vhost_svq_driver_area_size(const VhostShadowVirtqueue *svq);
83
+size_t vhost_svq_device_area_size(const VhostShadowVirtqueue *svq);
84
85
void vhost_svq_stop(VhostShadowVirtqueue *svq);
86
32
--
87
--
33
2.7.4
88
2.7.4
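
The commit message above only states that qemu_receive_packet() can detect reentrancy; the mechanism itself is not visible in this hunk. A minimal model of the idea, under the assumption that a per-queue delivery flag is enough, looks like this (illustrative names, not QEMU's implementation):

    #include <stdio.h>

    static unsigned delivering;   /* stands in for a per-queue flag */

    static int receive_packet(const char *buf, int len);

    /* Device loopback path: TX feeds straight back into our own RX. */
    static void loopback_transmit(const char *buf, int len)
    {
        receive_packet(buf, len);
    }

    static int receive_packet(const char *buf, int len)
    {
        if (delivering) {
            return 0;   /* re-entered: drop early instead of recursing */
        }
        delivering++;
        /* Deliver to the device model; a looped-back transmit from the
         * model re-enters receive_packet() and is caught above. */
        loopback_transmit(buf, len);
        delivering--;
        return len;
    }

    int main(void)
    {
        return receive_packet("pkt", 3) == 3 ? 0 : 1;
    }
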
34
89
35
90
1
From: Alexey Kirillov <lekiravi@yandex-team.ru>
1
From: Eugenio Pérez <eperezma@redhat.com>
2
2
3
The info_str field of the NetClientState structure is a fixed array with a size
3
First half of the buffer forwarding part, preparing the vhost-vdpa
4
of 256 bytes. Most of this space usually goes unused, and the field itself is used
4
callbacks to offer it through SVQ. QEMU cannot enable it at this moment, so
5
exclusively for HMP "info network".
5
this is effectively dead code for now, but it helps to reduce
6
patch size.
6
7
7
The patch converts info_str to dynamically allocated memory.
8
Acked-by: Michael S. Tsirkin <mst@redhat.com>
8
9
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
9
This also allows us to painlessly drop usage of this field
10
for backend devices.
11
12
Signed-off-by: Alexey Kirillov <lekiravi@yandex-team.ru>
13
Signed-off-by: Jason Wang <jasowang@redhat.com>
10
Signed-off-by: Jason Wang <jasowang@redhat.com>
14
---
11
---
15
hw/net/xen_nic.c | 5 ++---
12
hw/virtio/vhost-vdpa.c | 48 +++++++++++++++++++++++++++++++++++++++++-------
16
include/net/net.h | 2 +-
13
1 file changed, 41 insertions(+), 7 deletions(-)
17
net/l2tpv3.c | 3 +--
18
net/net.c | 14 ++++++++------
19
net/slirp.c | 5 ++---
20
net/socket.c | 43 ++++++++++++++++++++++++-------------------
21
net/tap-win32.c | 3 +--
22
net/tap.c | 13 +++++--------
23
net/vde.c | 3 +--
24
net/vhost-user.c | 3 +--
25
net/vhost-vdpa.c | 2 +-
26
11 files changed, 47 insertions(+), 49 deletions(-)
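
The conversion applied across all these files follows one mechanical pattern: a fixed char[256] filled with snprintf() becomes a heap string built with g_strdup_printf() and released in the client destructor. A minimal sketch of the pattern (the struct and names are illustrative, not QEMU's):

    #include <glib.h>

    typedef struct {
        char *info_str;   /* was: char info_str[256]; */
    } Client;

    static void client_set_info(Client *c, const char *ifname)
    {
        g_free(c->info_str);   /* g_free(NULL) is a no-op, so resets are safe */
        c->info_str = g_strdup_printf("tap: ifname=%s", ifname);
    }

    static void client_free(Client *c)
    {
        g_free(c->info_str);   /* mirrors the cleanup in qemu_free_net_client() */
        c->info_str = NULL;
    }

    int main(void)
    {
        Client c = { NULL };
        client_set_info(&c, "tap0");
        g_print("%s\n", c.info_str);
        client_free(&c);
        return 0;
    }
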
27
14
28
diff --git a/hw/net/xen_nic.c b/hw/net/xen_nic.c
15
diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
29
index XXXXXXX..XXXXXXX 100644
16
index XXXXXXX..XXXXXXX 100644
30
--- a/hw/net/xen_nic.c
17
--- a/hw/virtio/vhost-vdpa.c
31
+++ b/hw/net/xen_nic.c
18
+++ b/hw/virtio/vhost-vdpa.c
32
@@ -XXX,XX +XXX,XX @@ static int net_init(struct XenLegacyDevice *xendev)
19
@@ -XXX,XX +XXX,XX @@ static int vhost_vdpa_get_config(struct vhost_dev *dev, uint8_t *config,
33
netdev->nic = qemu_new_nic(&net_xen_info, &netdev->conf,
20
return ret;
34
"xen", NULL, netdev);
21
}
35
22
36
- snprintf(qemu_get_queue(netdev->nic)->info_str,
23
+static int vhost_vdpa_set_dev_vring_base(struct vhost_dev *dev,
37
- sizeof(qemu_get_queue(netdev->nic)->info_str),
24
+ struct vhost_vring_state *ring)
38
- "nic: xenbus vif macaddr=%s", netdev->mac);
25
+{
39
+ qemu_get_queue(netdev->nic)->info_str = g_strdup_printf(
26
+ trace_vhost_vdpa_set_vring_base(dev, ring->index, ring->num);
40
+ "nic: xenbus vif macaddr=%s", netdev->mac);
27
+ return vhost_vdpa_call(dev, VHOST_SET_VRING_BASE, ring);
41
28
+}
42
/* fill info */
29
+
43
xenstore_write_be_int(&netdev->xendev, "feature-rx-copy", 1);
30
static int vhost_vdpa_set_vring_dev_kick(struct vhost_dev *dev,
44
diff --git a/include/net/net.h b/include/net/net.h
31
struct vhost_vring_file *file)
45
index XXXXXXX..XXXXXXX 100644
46
--- a/include/net/net.h
47
+++ b/include/net/net.h
48
@@ -XXX,XX +XXX,XX @@ struct NetClientState {
49
NetQueue *incoming_queue;
50
char *model;
51
char *name;
52
- char info_str[256];
53
+ char *info_str;
54
NetdevInfo *stored_config;
55
unsigned receive_disabled : 1;
56
NetClientDestructor *destructor;
57
diff --git a/net/l2tpv3.c b/net/l2tpv3.c
58
index XXXXXXX..XXXXXXX 100644
59
--- a/net/l2tpv3.c
60
+++ b/net/l2tpv3.c
61
@@ -XXX,XX +XXX,XX @@ int net_init_l2tpv3(const Netdev *netdev,
62
QAPI_CLONE_MEMBERS(NetdevL2TPv3Options,
63
&nc->stored_config->u.l2tpv3, l2tpv3);
64
65
- snprintf(s->nc.info_str, sizeof(s->nc.info_str),
66
- "l2tpv3: connected");
67
+ s->nc.info_str = g_strdup_printf("l2tpv3: connected");
68
return 0;
69
outerr:
70
qemu_del_net_client(nc);
71
diff --git a/net/net.c b/net/net.c
72
index XXXXXXX..XXXXXXX 100644
73
--- a/net/net.c
74
+++ b/net/net.c
75
@@ -XXX,XX +XXX,XX @@ char *qemu_mac_strdup_printf(const uint8_t *macaddr)
76
77
void qemu_format_nic_info_str(NetClientState *nc, uint8_t macaddr[6])
78
{
32
{
79
- snprintf(nc->info_str, sizeof(nc->info_str),
33
@@ -XXX,XX +XXX,XX @@ static int vhost_vdpa_set_vring_dev_call(struct vhost_dev *dev,
80
- "model=%s,macaddr=%02x:%02x:%02x:%02x:%02x:%02x",
34
return vhost_vdpa_call(dev, VHOST_SET_VRING_CALL, file);
81
- nc->model,
82
- macaddr[0], macaddr[1], macaddr[2],
83
- macaddr[3], macaddr[4], macaddr[5]);
84
+ g_free(nc->info_str);
85
+ nc->info_str = g_strdup_printf(
86
+ "model=%s,macaddr=%02x:%02x:%02x:%02x:%02x:%02x",
87
+ nc->model,
88
+ macaddr[0], macaddr[1], macaddr[2],
89
+ macaddr[3], macaddr[4], macaddr[5]);
90
}
35
}
91
36
92
static int mac_table[256] = {0};
37
+static int vhost_vdpa_set_vring_dev_addr(struct vhost_dev *dev,
93
@@ -XXX,XX +XXX,XX @@ static void qemu_free_net_client(NetClientState *nc)
38
+ struct vhost_vring_addr *addr)
94
}
39
+{
95
g_free(nc->name);
40
+ trace_vhost_vdpa_set_vring_addr(dev, addr->index, addr->flags,
96
g_free(nc->model);
41
+ addr->desc_user_addr, addr->used_user_addr,
97
+ g_free(nc->info_str);
42
+ addr->avail_user_addr,
98
qapi_free_NetdevInfo(nc->stored_config);
43
+ addr->log_guest_addr);
99
if (nc->destructor) {
44
+
100
nc->destructor(nc);
45
+ return vhost_vdpa_call(dev, VHOST_SET_VRING_ADDR, addr);
101
@@ -XXX,XX +XXX,XX @@ void print_net_client(Monitor *mon, NetClientState *nc)
46
+
102
monitor_printf(mon, "%s: index=%d,type=%s,%s\n", nc->name,
47
+}
103
nc->queue_index,
48
+
104
NetClientDriver_str(nc->info->type),
49
/**
105
- nc->info_str);
50
* Set the shadow virtqueue descriptors to the device
106
+ nc->info_str ? nc->info_str : "");
51
*
107
if (!QTAILQ_EMPTY(&nc->filters)) {
52
@@ -XXX,XX +XXX,XX @@ static int vhost_vdpa_set_log_base(struct vhost_dev *dev, uint64_t base,
108
monitor_printf(mon, "filters:\n");
53
static int vhost_vdpa_set_vring_addr(struct vhost_dev *dev,
109
}
54
struct vhost_vring_addr *addr)
110
diff --git a/net/slirp.c b/net/slirp.c
55
{
111
index XXXXXXX..XXXXXXX 100644
56
- trace_vhost_vdpa_set_vring_addr(dev, addr->index, addr->flags,
112
--- a/net/slirp.c
57
- addr->desc_user_addr, addr->used_user_addr,
113
+++ b/net/slirp.c
58
- addr->avail_user_addr,
114
@@ -XXX,XX +XXX,XX @@ static int net_slirp_init(NetClientState *peer, const char *model,
59
- addr->log_guest_addr);
115
stored->tftp_server_name = g_strdup(tftp_server_name);
60
- return vhost_vdpa_call(dev, VHOST_SET_VRING_ADDR, addr);
116
}
61
+ struct vhost_vdpa *v = dev->opaque;
117
62
+
118
- snprintf(nc->info_str, sizeof(nc->info_str),
63
+ if (v->shadow_vqs_enabled) {
119
- "net=%s,restrict=%s", inet_ntoa(net),
64
+ /*
120
- restricted ? "on" : "off");
65
+ * Device vring addr was set at device start. SVQ base is handled by
121
+ nc->info_str = g_strdup_printf("net=%s,restrict=%s", inet_ntoa(net),
66
+ * VirtQueue code.
122
+ restricted ? "on" : "off");
67
+ */
123
68
+ return 0;
124
s = DO_UPCAST(SlirpState, nc, nc);
69
+ }
125
70
+
126
diff --git a/net/socket.c b/net/socket.c
71
+ return vhost_vdpa_set_vring_dev_addr(dev, addr);
127
index XXXXXXX..XXXXXXX 100644
128
--- a/net/socket.c
129
+++ b/net/socket.c
130
@@ -XXX,XX +XXX,XX @@ static void net_socket_send(void *opaque)
131
s->fd = -1;
132
net_socket_rs_init(&s->rs, net_socket_rs_finalize, false);
133
s->nc.link_down = true;
134
- memset(s->nc.info_str, 0, sizeof(s->nc.info_str));
135
+ g_free(s->nc.info_str);
136
+ s->nc.info_str = g_new0(char, 1);
137
138
return;
139
}
140
@@ -XXX,XX +XXX,XX @@ static NetSocketState *net_socket_fd_init_dgram(NetClientState *peer,
141
stored->mcast = g_strdup(mcast);
142
143
s->dgram_dst = saddr;
144
- snprintf(nc->info_str, sizeof(nc->info_str),
145
- "socket: fd=%d (cloned mcast=%s:%d)",
146
- fd, inet_ntoa(saddr.sin_addr), ntohs(saddr.sin_port));
147
+ nc->info_str = g_strdup_printf("socket: fd=%d (cloned mcast=%s:%d)",
148
+ fd, inet_ntoa(saddr.sin_addr),
149
+ ntohs(saddr.sin_port));
150
} else {
151
if (sa_type == SOCKET_ADDRESS_TYPE_UNIX) {
152
s->dgram_dst.sin_family = AF_UNIX;
153
}
154
155
- snprintf(nc->info_str, sizeof(nc->info_str),
156
- "socket: fd=%d %s", fd, SocketAddressType_str(sa_type));
157
+ nc->info_str = g_strdup_printf("socket: fd=%d %s",
158
+ fd, SocketAddressType_str(sa_type));
159
}
160
161
return s;
162
@@ -XXX,XX +XXX,XX @@ static NetSocketState *net_socket_fd_init_stream(NetClientState *peer,
163
164
nc = qemu_new_net_client(&net_socket_info, peer, model, name);
165
166
- snprintf(nc->info_str, sizeof(nc->info_str), "socket: fd=%d", fd);
167
+ nc->info_str = g_strdup_printf("socket: fd=%d", fd);
168
169
s = DO_UPCAST(NetSocketState, nc, nc);
170
171
@@ -XXX,XX +XXX,XX @@ static void net_socket_accept(void *opaque)
172
stored->has_fd = true;
173
stored->fd = g_strdup_printf("%d", fd);
174
175
- snprintf(s->nc.info_str, sizeof(s->nc.info_str),
176
- "socket: connection from %s:%d",
177
- inet_ntoa(saddr.sin_addr), ntohs(saddr.sin_port));
178
+ g_free(s->nc.info_str);
179
+ s->nc.info_str = g_strdup_printf("socket: connection from %s:%d",
180
+ inet_ntoa(saddr.sin_addr),
181
+ ntohs(saddr.sin_port));
182
}
72
}
183
73
184
static int net_socket_listen_init(NetClientState *peer,
74
static int vhost_vdpa_set_vring_num(struct vhost_dev *dev,
185
@@ -XXX,XX +XXX,XX @@ static int net_socket_connect_init(NetClientState *peer,
75
@@ -XXX,XX +XXX,XX @@ static int vhost_vdpa_set_vring_num(struct vhost_dev *dev,
186
stored->has_connect = true;
76
static int vhost_vdpa_set_vring_base(struct vhost_dev *dev,
187
stored->connect = g_strdup(host_str);
77
struct vhost_vring_state *ring)
188
78
{
189
- snprintf(s->nc.info_str, sizeof(s->nc.info_str),
79
- trace_vhost_vdpa_set_vring_base(dev, ring->index, ring->num);
190
- "socket: connect to %s:%d",
80
- return vhost_vdpa_call(dev, VHOST_SET_VRING_BASE, ring);
191
- inet_ntoa(saddr.sin_addr), ntohs(saddr.sin_port));
81
+ struct vhost_vdpa *v = dev->opaque;
192
+ g_free(s->nc.info_str);
82
+
193
+ s->nc.info_str = g_strdup_printf("socket: connect to %s:%d",
83
+ if (v->shadow_vqs_enabled) {
194
+ inet_ntoa(saddr.sin_addr),
84
+ /*
195
+ ntohs(saddr.sin_port));
85
+ * Device vring base was set at device start. SVQ base is handled by
196
return 0;
86
+ * VirtQueue code.
87
+ */
88
+ return 0;
89
+ }
90
+
91
+ return vhost_vdpa_set_dev_vring_base(dev, ring);
197
}
92
}
198
93
199
@@ -XXX,XX +XXX,XX @@ static int net_socket_mcast_init(NetClientState *peer,
94
static int vhost_vdpa_get_vring_base(struct vhost_dev *dev,
200
stored->localaddr = g_strdup(localaddr_str);
201
}
202
203
- snprintf(s->nc.info_str, sizeof(s->nc.info_str),
204
- "socket: mcast=%s:%d",
205
- inet_ntoa(saddr.sin_addr), ntohs(saddr.sin_port));
206
+ g_free(s->nc.info_str);
207
+ s->nc.info_str = g_strdup_printf("socket: mcast=%s:%d",
208
+ inet_ntoa(saddr.sin_addr),
209
+ ntohs(saddr.sin_port));
210
return 0;
211
212
}
213
@@ -XXX,XX +XXX,XX @@ static int net_socket_udp_init(NetClientState *peer,
214
stored->has_udp = true;
215
stored->udp = g_strdup(rhost);
216
217
- snprintf(s->nc.info_str, sizeof(s->nc.info_str),
218
- "socket: udp=%s:%d",
219
- inet_ntoa(raddr.sin_addr), ntohs(raddr.sin_port));
220
+ g_free(s->nc.info_str);
221
+ s->nc.info_str = g_strdup_printf("socket: udp=%s:%d",
222
+ inet_ntoa(raddr.sin_addr),
223
+ ntohs(raddr.sin_port));
224
return 0;
225
}
226
227
diff --git a/net/tap-win32.c b/net/tap-win32.c
228
index XXXXXXX..XXXXXXX 100644
229
--- a/net/tap-win32.c
230
+++ b/net/tap-win32.c
231
@@ -XXX,XX +XXX,XX @@ static int tap_win32_init(NetClientState *peer, const char *model,
232
stored->has_ifname = true;
233
stored->ifname = g_strdup(ifname);
234
235
- snprintf(s->nc.info_str, sizeof(s->nc.info_str),
236
- "tap: ifname=%s", ifname);
237
+ s->nc.info_str = g_strdup_printf("tap: ifname=%s", ifname);
238
239
s->handle = handle;
240
241
diff --git a/net/tap.c b/net/tap.c
242
index XXXXXXX..XXXXXXX 100644
243
--- a/net/tap.c
244
+++ b/net/tap.c
245
@@ -XXX,XX +XXX,XX @@ int net_init_bridge(const Netdev *netdev, const char *name,
246
stored->helper = g_strdup(helper);
247
}
248
249
- snprintf(s->nc.info_str, sizeof(s->nc.info_str), "helper=%s,br=%s", helper,
250
- br);
251
+ s->nc.info_str = g_strdup_printf("helper=%s,br=%s", helper, br);
252
253
return 0;
254
}
255
@@ -XXX,XX +XXX,XX @@ static void net_init_tap_one(const NetdevTapOptions *tap, NetClientState *peer,
256
g_free(tmp_s);
257
}
258
259
- snprintf(s->nc.info_str, sizeof(s->nc.info_str), "fd=%d", fd);
260
+ s->nc.info_str = g_strdup_printf("fd=%d", fd);
261
} else if (tap->has_helper) {
262
if (!stored->has_helper) {
263
stored->has_helper = true;
264
@@ -XXX,XX +XXX,XX @@ static void net_init_tap_one(const NetdevTapOptions *tap, NetClientState *peer,
265
g_strdup(DEFAULT_BRIDGE_INTERFACE);
266
}
267
268
- snprintf(s->nc.info_str, sizeof(s->nc.info_str), "helper=%s",
269
- tap->helper);
270
+ s->nc.info_str = g_strdup_printf("helper=%s", tap->helper);
271
} else {
272
if (ifname && !stored->has_ifname) {
273
stored->has_ifname = true;
274
@@ -XXX,XX +XXX,XX @@ static void net_init_tap_one(const NetdevTapOptions *tap, NetClientState *peer,
275
stored->downscript = g_strdup(downscript);
276
}
277
278
- snprintf(s->nc.info_str, sizeof(s->nc.info_str),
279
- "ifname=%s,script=%s,downscript=%s", ifname, script,
280
- downscript);
281
+ s->nc.info_str = g_strdup_printf("ifname=%s,script=%s,downscript=%s",
282
+ ifname, script, downscript);
283
284
if (strcmp(downscript, "no") != 0) {
285
snprintf(s->down_script, sizeof(s->down_script), "%s", downscript);
286
diff --git a/net/vde.c b/net/vde.c
287
index XXXXXXX..XXXXXXX 100644
288
--- a/net/vde.c
289
+++ b/net/vde.c
290
@@ -XXX,XX +XXX,XX @@ static int net_vde_init(NetClientState *peer, const char *model,
291
292
nc = qemu_new_net_client(&net_vde_info, peer, model, name);
293
294
- snprintf(nc->info_str, sizeof(nc->info_str), "sock=%s,fd=%d",
295
- sock, vde_datafd(vde));
296
+ nc->info_str = g_strdup_printf("sock=%s,fd=%d", sock, vde_datafd(vde));
297
298
s = DO_UPCAST(VDEState, nc, nc);
299
300
diff --git a/net/vhost-user.c b/net/vhost-user.c
301
index XXXXXXX..XXXXXXX 100644
302
--- a/net/vhost-user.c
303
+++ b/net/vhost-user.c
304
@@ -XXX,XX +XXX,XX @@ static int net_vhost_user_init(NetClientState *peer, const char *device,
305
user = g_new0(struct VhostUserState, 1);
306
for (i = 0; i < queues; i++) {
307
nc = qemu_new_net_client(&net_vhost_user_info, peer, device, name);
308
- snprintf(nc->info_str, sizeof(nc->info_str), "vhost-user%d to %s",
309
- i, chr->label);
310
+ nc->info_str = g_strdup_printf("vhost-user%d to %s", i, chr->label);
311
nc->queue_index = i;
312
if (!nc0) {
313
nc0 = nc;
314
diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
315
index XXXXXXX..XXXXXXX 100644
316
--- a/net/vhost-vdpa.c
317
+++ b/net/vhost-vdpa.c
318
@@ -XXX,XX +XXX,XX @@ static int net_vhost_vdpa_init(NetClientState *peer, const char *device,
319
stored->has_queues = true;
320
stored->queues = 1; /* TODO: change when support multiqueue */
321
322
- snprintf(nc->info_str, sizeof(nc->info_str), TYPE_VHOST_VDPA);
323
+ nc->info_str = g_strdup_printf(TYPE_VHOST_VDPA);
324
nc->queue_index = 0;
325
s = DO_UPCAST(VhostVDPAState, nc, nc);
326
vdpa_device_fd = qemu_open_old(vhostdev, O_RDWR);
327
--
95
--
328
2.7.4
96
2.7.4
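
The vhost-vdpa hunks above all apply the same interception shape: when the shadow virtqueue is enabled, the guest-supplied vring state must not reach the device (SVQ owns the device rings), so the callback succeeds without forwarding. A condensed model of that control flow, with illustrative names:

    #include <stdbool.h>
    #include <stdio.h>

    struct vdpa_dev {
        bool shadow_vqs_enabled;
    };

    /* Stand-in for the VHOST_SET_VRING_ADDR ioctl path. */
    static int set_vring_dev_addr(struct vdpa_dev *d, unsigned long addr)
    {
        printf("forwarding vring addr 0x%lx to device\n", addr);
        return 0;
    }

    static int set_vring_addr(struct vdpa_dev *d, unsigned long guest_addr)
    {
        if (d->shadow_vqs_enabled) {
            /* Device vring addr was already set at SVQ start; the
             * guest's value is handled by the VirtQueue code. */
            return 0;
        }
        return set_vring_dev_addr(d, guest_addr);
    }

    int main(void)
    {
        struct vdpa_dev d = { .shadow_vqs_enabled = true };
        return set_vring_addr(&d, 0x1000);
    }
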
329
97
330
98
1
From: Cornelia Huck <cohuck@redhat.com>
1
From: Eugenio Pérez <eperezma@redhat.com>
2
2
3
The pvrdma code relies on the pvrdma_ring.h kernel header for some
3
Initial version of the shadow virtqueue that actually forwards buffers. There
4
basic ring buffer handling. The content of that header isn't very
4
is no IOMMU support at the moment, and that will be addressed in future
5
exciting, but contains some (q)atomic_*() invocations that (a)
5
patches of this series. Since all vhost-vdpa devices use forced IOMMU,
6
cause manual massaging when doing a headers update, and (b) are
6
this means that SVQ is not usable at this point of the series on any
7
an indication that we probably should not be importing that header
7
device.
8
at all.
8
9
9
For simplicity it only supports modern devices, which expect the vring
10
Let's reimplement the ring buffer handling directly in the pvrdma
10
in little endian, with split ring and no event idx or indirect
11
code instead. This arguably also improves readability of the code.
11
descriptors. Support for them will not be added in this series.
12
12
13
Importing the header can now be dropped.
13
It reuses the VirtQueue code for the device part. The driver part is
14
14
based on Linux's virtio_ring driver, but with stripped functionality
15
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
15
and optimizations so it's easier to review.
16
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
16
17
Reviewed-by: Yuval Shaia <yuval.shaia.ml@gmail.com>
17
However, forwarding buffers has some particular pieces: one of the most
18
Tested-by: Yuval Shaia <yuval.shaia.ml@gmail.com>
18
unexpected ones is that a guest's buffer can span more than
19
one descriptor in SVQ. While this is handled gracefully by QEMU's
20
emulated virtio devices, it may leave the SVQ unexpectedly full. This
21
patch also solves it by checking for this condition at both guest's
22
kicks and device's calls. The code may be more elegant in the future if
23
SVQ code runs in its own iocontext.
24
25
Acked-by: Michael S. Tsirkin <mst@redhat.com>
26
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
19
Signed-off-by: Jason Wang <jasowang@redhat.com>
27
Signed-off-by: Jason Wang <jasowang@redhat.com>
20
---
28
---
21
hw/rdma/vmw/pvrdma.h | 5 +-
29
hw/virtio/vhost-shadow-virtqueue.c | 354 ++++++++++++++++++++++++++++++++++++-
22
hw/rdma/vmw/pvrdma_cmd.c | 6 +-
30
hw/virtio/vhost-shadow-virtqueue.h | 26 +++
23
hw/rdma/vmw/pvrdma_dev_ring.c | 41 ++++----
31
hw/virtio/vhost-vdpa.c | 159 ++++++++++++++++-
24
hw/rdma/vmw/pvrdma_dev_ring.h | 9 +-
32
3 files changed, 527 insertions(+), 12 deletions(-)
25
hw/rdma/vmw/pvrdma_main.c | 4 +-
33
26
.../drivers/infiniband/hw/vmw_pvrdma/pvrdma_ring.h | 114 ---------------------
34
diff --git a/hw/virtio/vhost-shadow-virtqueue.c b/hw/virtio/vhost-shadow-virtqueue.c
27
scripts/update-linux-headers.sh | 3 +-
28
7 files changed, 38 insertions(+), 144 deletions(-)
29
delete mode 100644 include/standard-headers/drivers/infiniband/hw/vmw_pvrdma/pvrdma_ring.h
30
31
diff --git a/hw/rdma/vmw/pvrdma.h b/hw/rdma/vmw/pvrdma.h
32
index XXXXXXX..XXXXXXX 100644
35
index XXXXXXX..XXXXXXX 100644
33
--- a/hw/rdma/vmw/pvrdma.h
36
--- a/hw/virtio/vhost-shadow-virtqueue.c
34
+++ b/hw/rdma/vmw/pvrdma.h
37
+++ b/hw/virtio/vhost-shadow-virtqueue.c
35
@@ -XXX,XX +XXX,XX @@
38
@@ -XXX,XX +XXX,XX @@
36
#include "../rdma_backend_defs.h"
39
#include "qemu/error-report.h"
37
#include "../rdma_rm_defs.h"
40
#include "qapi/error.h"
38
41
#include "qemu/main-loop.h"
39
-#include "standard-headers/drivers/infiniband/hw/vmw_pvrdma/pvrdma_ring.h"
42
+#include "qemu/log.h"
40
#include "standard-headers/drivers/infiniband/hw/vmw_pvrdma/pvrdma_dev_api.h"
43
+#include "qemu/memalign.h"
41
#include "pvrdma_dev_ring.h"
44
#include "linux-headers/linux/vhost.h"
42
#include "qom/object.h"
45
43
@@ -XXX,XX +XXX,XX @@ typedef struct DSRInfo {
46
/**
44
union pvrdma_cmd_req *req;
47
@@ -XXX,XX +XXX,XX @@ bool vhost_svq_valid_features(uint64_t features, Error **errp)
45
union pvrdma_cmd_resp *rsp;
48
}
46
49
47
- struct pvrdma_ring *async_ring_state;
50
/**
48
+ PvrdmaRingState *async_ring_state;
51
- * Forward guest notifications.
49
PvrdmaRing async;
52
+ * Number of descriptors that the SVQ can make available from the guest.
50
53
+ *
51
- struct pvrdma_ring *cq_ring_state;
54
+ * @svq: The svq
52
+ PvrdmaRingState *cq_ring_state;
55
+ */
53
PvrdmaRing cq;
56
+static uint16_t vhost_svq_available_slots(const VhostShadowVirtqueue *svq)
54
} DSRInfo;
57
+{
55
58
+ return svq->vring.num - (svq->shadow_avail_idx - svq->shadow_used_idx);
56
diff --git a/hw/rdma/vmw/pvrdma_cmd.c b/hw/rdma/vmw/pvrdma_cmd.c
59
+}
60
+
61
+static void vhost_vring_write_descs(VhostShadowVirtqueue *svq,
62
+ const struct iovec *iovec,
63
+ size_t num, bool more_descs, bool write)
64
+{
65
+ uint16_t i = svq->free_head, last = svq->free_head;
66
+ unsigned n;
67
+ uint16_t flags = write ? cpu_to_le16(VRING_DESC_F_WRITE) : 0;
68
+ vring_desc_t *descs = svq->vring.desc;
69
+
70
+ if (num == 0) {
71
+ return;
72
+ }
73
+
74
+ for (n = 0; n < num; n++) {
75
+ if (more_descs || (n + 1 < num)) {
76
+ descs[i].flags = flags | cpu_to_le16(VRING_DESC_F_NEXT);
77
+ } else {
78
+ descs[i].flags = flags;
79
+ }
80
+ descs[i].addr = cpu_to_le64((hwaddr)iovec[n].iov_base);
81
+ descs[i].len = cpu_to_le32(iovec[n].iov_len);
82
+
83
+ last = i;
84
+ i = cpu_to_le16(descs[i].next);
85
+ }
86
+
87
+ svq->free_head = le16_to_cpu(descs[last].next);
88
+}
89
+
90
+static bool vhost_svq_add_split(VhostShadowVirtqueue *svq,
91
+ VirtQueueElement *elem,
92
+ unsigned *head)
93
+{
94
+ unsigned avail_idx;
95
+ vring_avail_t *avail = svq->vring.avail;
96
+
97
+ *head = svq->free_head;
98
+
99
+ /* We need some descriptors here */
100
+ if (unlikely(!elem->out_num && !elem->in_num)) {
101
+ qemu_log_mask(LOG_GUEST_ERROR,
102
+ "Guest provided element with no descriptors");
103
+ return false;
104
+ }
105
+
106
+ vhost_vring_write_descs(svq, elem->out_sg, elem->out_num,
107
+ elem->in_num > 0, false);
108
+ vhost_vring_write_descs(svq, elem->in_sg, elem->in_num, false, true);
109
+
110
+ /*
111
+ * Put the entry in the available array (but don't update avail->idx until
112
+ * they do sync).
113
+ */
114
+ avail_idx = svq->shadow_avail_idx & (svq->vring.num - 1);
115
+ avail->ring[avail_idx] = cpu_to_le16(*head);
116
+ svq->shadow_avail_idx++;
117
+
118
+ /* Update the avail index after write the descriptor */
119
+ smp_wmb();
120
+ avail->idx = cpu_to_le16(svq->shadow_avail_idx);
121
+
122
+ return true;
123
+}
124
+
125
+static bool vhost_svq_add(VhostShadowVirtqueue *svq, VirtQueueElement *elem)
126
+{
127
+ unsigned qemu_head;
128
+ bool ok = vhost_svq_add_split(svq, elem, &qemu_head);
129
+ if (unlikely(!ok)) {
130
+ return false;
131
+ }
132
+
133
+ svq->ring_id_maps[qemu_head] = elem;
134
+ return true;
135
+}
136
+
137
+static void vhost_svq_kick(VhostShadowVirtqueue *svq)
138
+{
139
+ /*
140
+ * We need to expose the available array entries before checking the used
141
+ * flags
142
+ */
143
+ smp_mb();
144
+ if (svq->vring.used->flags & VRING_USED_F_NO_NOTIFY) {
145
+ return;
146
+ }
147
+
148
+ event_notifier_set(&svq->hdev_kick);
149
+}
150
+
151
+/**
152
+ * Forward available buffers.
153
+ *
154
+ * @svq: Shadow VirtQueue
155
+ *
156
+ * Note that this function does not guarantee that all guest's available
157
+ * buffers are available to the device in SVQ avail ring. The guest may have
158
+ * exposed a GPA / GIOVA contiguous buffer, but it may not be contiguous in
159
+ * qemu vaddr.
160
+ *
161
+ * If that happens, guest's kick notifications will be disabled until the
162
+ * device uses some buffers.
163
+ */
164
+static void vhost_handle_guest_kick(VhostShadowVirtqueue *svq)
165
+{
166
+ /* Clear event notifier */
167
+ event_notifier_test_and_clear(&svq->svq_kick);
168
+
169
+ /* Forward to the device as many available buffers as possible */
170
+ do {
171
+ virtio_queue_set_notification(svq->vq, false);
172
+
173
+ while (true) {
174
+ VirtQueueElement *elem;
175
+ bool ok;
176
+
177
+ if (svq->next_guest_avail_elem) {
178
+ elem = g_steal_pointer(&svq->next_guest_avail_elem);
179
+ } else {
180
+ elem = virtqueue_pop(svq->vq, sizeof(*elem));
181
+ }
182
+
183
+ if (!elem) {
184
+ break;
185
+ }
186
+
187
+ if (elem->out_num + elem->in_num >
188
+ vhost_svq_available_slots(svq)) {
189
+ /*
190
+ * This condition is possible since a contiguous buffer in GPA
191
+ * does not imply a contiguous buffer in qemu's VA
192
+ * scatter-gather segments. If that happens, the buffer exposed
193
+ * to the device needs to be a chain of descriptors at this
194
+ * moment.
195
+ *
196
+ * SVQ cannot hold more available buffers if we are here:
197
+ * queue the current guest descriptor and ignore further kicks
198
+ * until some elements are used.
199
+ */
200
+ svq->next_guest_avail_elem = elem;
201
+ return;
202
+ }
203
+
204
+ ok = vhost_svq_add(svq, elem);
205
+ if (unlikely(!ok)) {
206
+ /* VQ is broken, just return and ignore any other kicks */
207
+ return;
208
+ }
209
+ vhost_svq_kick(svq);
210
+ }
211
+
212
+ virtio_queue_set_notification(svq->vq, true);
213
+ } while (!virtio_queue_empty(svq->vq));
214
+}
215
+
216
+/**
217
+ * Handle guest's kick.
218
*
219
* @n: guest kick event notifier, the one that guest set to notify svq.
220
*/
221
-static void vhost_handle_guest_kick(EventNotifier *n)
222
+static void vhost_handle_guest_kick_notifier(EventNotifier *n)
223
{
224
VhostShadowVirtqueue *svq = container_of(n, VhostShadowVirtqueue,
225
svq_kick);
226
event_notifier_test_and_clear(n);
227
- event_notifier_set(&svq->hdev_kick);
228
+ vhost_handle_guest_kick(svq);
229
+}
230
+
231
+static bool vhost_svq_more_used(VhostShadowVirtqueue *svq)
+{
+    if (svq->last_used_idx != svq->shadow_used_idx) {
+        return true;
+    }
+
+    svq->shadow_used_idx = cpu_to_le16(svq->vring.used->idx);
+
+    return svq->last_used_idx != svq->shadow_used_idx;
 }
 
 /**
- * Forward vhost notifications
+ * Enable vhost device calls after disabling them.
+ *
+ * @svq: The svq
+ *
+ * It returns false if there are pending used buffers from the vhost device,
+ * avoiding the possible races between SVQ checking for more work and enabling
+ * callbacks. True if SVQ used vring has no more pending buffers.
+ */
+static bool vhost_svq_enable_notification(VhostShadowVirtqueue *svq)
+{
+    svq->vring.avail->flags &= ~cpu_to_le16(VRING_AVAIL_F_NO_INTERRUPT);
+    /* Make sure the flag is written before the read of used_idx */
+    smp_mb();
+    return !vhost_svq_more_used(svq);
+}
+
+static void vhost_svq_disable_notification(VhostShadowVirtqueue *svq)
+{
+    svq->vring.avail->flags |= cpu_to_le16(VRING_AVAIL_F_NO_INTERRUPT);
+}
+
+static VirtQueueElement *vhost_svq_get_buf(VhostShadowVirtqueue *svq,
+                                           uint32_t *len)
+{
+    vring_desc_t *descs = svq->vring.desc;
+    const vring_used_t *used = svq->vring.used;
+    vring_used_elem_t used_elem;
+    uint16_t last_used;
+
+    if (!vhost_svq_more_used(svq)) {
+        return NULL;
+    }
+
+    /* Only get used array entries after they have been exposed by dev */
+    smp_rmb();
+    last_used = svq->last_used_idx & (svq->vring.num - 1);
+    used_elem.id = le32_to_cpu(used->ring[last_used].id);
+    used_elem.len = le32_to_cpu(used->ring[last_used].len);
+
+    svq->last_used_idx++;
+    if (unlikely(used_elem.id >= svq->vring.num)) {
+        qemu_log_mask(LOG_GUEST_ERROR, "Device %s says index %u is used",
+                      svq->vdev->name, used_elem.id);
+        return NULL;
+    }
+
+    if (unlikely(!svq->ring_id_maps[used_elem.id])) {
+        qemu_log_mask(LOG_GUEST_ERROR,
+            "Device %s says index %u is used, but it was not available",
+            svq->vdev->name, used_elem.id);
+        return NULL;
+    }
+
+    descs[used_elem.id].next = svq->free_head;
+    svq->free_head = used_elem.id;
+
+    *len = used_elem.len;
+    return g_steal_pointer(&svq->ring_id_maps[used_elem.id]);
+}
+
+static void vhost_svq_flush(VhostShadowVirtqueue *svq,
+                            bool check_for_avail_queue)
+{
+    VirtQueue *vq = svq->vq;
+
+    /* Forward as many used buffers as possible. */
+    do {
+        unsigned i = 0;
+
+        vhost_svq_disable_notification(svq);
+        while (true) {
+            uint32_t len;
+            g_autofree VirtQueueElement *elem = vhost_svq_get_buf(svq, &len);
+            if (!elem) {
+                break;
+            }
+
+            if (unlikely(i >= svq->vring.num)) {
+                qemu_log_mask(LOG_GUEST_ERROR,
+                        "More than %u used buffers obtained in a %u size SVQ",
+                        i, svq->vring.num);
+                virtqueue_fill(vq, elem, len, i);
+                virtqueue_flush(vq, i);
+                return;
+            }
+            virtqueue_fill(vq, elem, len, i++);
+        }
+
+        virtqueue_flush(vq, i);
+        event_notifier_set(&svq->svq_call);
+
+        if (check_for_avail_queue && svq->next_guest_avail_elem) {
+            /*
+             * Avail ring was full when vhost_svq_flush was called, so it's a
+             * good moment to make more descriptors available if possible.
+             */
+            vhost_handle_guest_kick(svq);
+        }
+    } while (!vhost_svq_enable_notification(svq));
+}
+
+/**
+ * Forward used buffers.
  *
  * @n: hdev call event notifier, the one that device set to notify svq.
+ *
+ * Note that we are not making any buffers available in the loop, there is no
+ * way that it runs more than virtqueue size times.
  */
 static void vhost_svq_handle_call(EventNotifier *n)
 {
     VhostShadowVirtqueue *svq = container_of(n, VhostShadowVirtqueue,
                                              hdev_call);
     event_notifier_test_and_clear(n);
-    event_notifier_set(&svq->svq_call);
+    vhost_svq_flush(svq, true);
 }
 
 /**
@@ -XXX,XX +XXX,XX @@ void vhost_svq_set_svq_kick_fd(VhostShadowVirtqueue *svq, int svq_kick_fd)
     if (poll_start) {
         event_notifier_init_fd(svq_kick, svq_kick_fd);
         event_notifier_set(svq_kick);
-        event_notifier_set_handler(svq_kick, vhost_handle_guest_kick);
+        event_notifier_set_handler(svq_kick, vhost_handle_guest_kick_notifier);
+    }
+}
+
+/**
+ * Start the shadow virtqueue operation.
+ *
+ * @svq: Shadow Virtqueue
+ * @vdev: VirtIO device
+ * @vq: Virtqueue to shadow
+ */
+void vhost_svq_start(VhostShadowVirtqueue *svq, VirtIODevice *vdev,
+                     VirtQueue *vq)
+{
+    size_t desc_size, driver_size, device_size;
+
+    svq->next_guest_avail_elem = NULL;
+    svq->shadow_avail_idx = 0;
+    svq->shadow_used_idx = 0;
+    svq->last_used_idx = 0;
+    svq->vdev = vdev;
+    svq->vq = vq;
+
+    svq->vring.num = virtio_queue_get_num(vdev, virtio_get_queue_index(vq));
+    driver_size = vhost_svq_driver_area_size(svq);
+    device_size = vhost_svq_device_area_size(svq);
+    svq->vring.desc = qemu_memalign(qemu_real_host_page_size, driver_size);
+    desc_size = sizeof(vring_desc_t) * svq->vring.num;
+    svq->vring.avail = (void *)((char *)svq->vring.desc + desc_size);
+    memset(svq->vring.desc, 0, driver_size);
+    svq->vring.used = qemu_memalign(qemu_real_host_page_size, device_size);
+    memset(svq->vring.used, 0, device_size);
+    svq->ring_id_maps = g_new0(VirtQueueElement *, svq->vring.num);
+    for (unsigned i = 0; i < svq->vring.num - 1; i++) {
+        svq->vring.desc[i].next = cpu_to_le16(i + 1);
     }
 }
 
@@ -XXX,XX +XXX,XX @@ void vhost_svq_set_svq_kick_fd(VhostShadowVirtqueue *svq, int svq_kick_fd)
 void vhost_svq_stop(VhostShadowVirtqueue *svq)
 {
     event_notifier_set_handler(&svq->svq_kick, NULL);
+    g_autofree VirtQueueElement *next_avail_elem = NULL;
+
+    if (!svq->vq) {
+        return;
+    }
+
+    /* Send all pending used descriptors to guest */
+    vhost_svq_flush(svq, false);
+
+    for (unsigned i = 0; i < svq->vring.num; ++i) {
+        g_autofree VirtQueueElement *elem = NULL;
+        elem = g_steal_pointer(&svq->ring_id_maps[i]);
+        if (elem) {
+            virtqueue_detach_element(svq->vq, elem, 0);
+        }
+    }
+
+    next_avail_elem = g_steal_pointer(&svq->next_guest_avail_elem);
+    if (next_avail_elem) {
+        virtqueue_detach_element(svq->vq, next_avail_elem, 0);
+    }
+    svq->vq = NULL;
+    g_free(svq->ring_id_maps);
+    qemu_vfree(svq->vring.desc);
+    qemu_vfree(svq->vring.used);
 }
 
 /**
diff --git a/hw/virtio/vhost-shadow-virtqueue.h b/hw/virtio/vhost-shadow-virtqueue.h
index XXXXXXX..XXXXXXX 100644
--- a/hw/virtio/vhost-shadow-virtqueue.h
+++ b/hw/virtio/vhost-shadow-virtqueue.h
@@ -XXX,XX +XXX,XX @@ typedef struct VhostShadowVirtqueue {
 
     /* Guest's call notifier, where the SVQ calls guest. */
     EventNotifier svq_call;
+
+    /* Virtio queue shadowing */
+    VirtQueue *vq;
+
+    /* Virtio device */
+    VirtIODevice *vdev;
+
+    /* Map for use the guest's descriptors */
+    VirtQueueElement **ring_id_maps;
+
+    /* Next VirtQueue element that guest made available */
+    VirtQueueElement *next_guest_avail_elem;
+
+    /* Next head to expose to the device */
+    uint16_t shadow_avail_idx;
+
+    /* Next free descriptor */
+    uint16_t free_head;
+
+    /* Last seen used idx */
+    uint16_t shadow_used_idx;
+
+    /* Next head to consume from the device */
+    uint16_t last_used_idx;
 } VhostShadowVirtqueue;
 
 bool vhost_svq_valid_features(uint64_t features, Error **errp);
@@ -XXX,XX +XXX,XX @@ void vhost_svq_get_vring_addr(const VhostShadowVirtqueue *svq,
 size_t vhost_svq_driver_area_size(const VhostShadowVirtqueue *svq);
 size_t vhost_svq_device_area_size(const VhostShadowVirtqueue *svq);
 
+void vhost_svq_start(VhostShadowVirtqueue *svq, VirtIODevice *vdev,
+                     VirtQueue *vq);
 void vhost_svq_stop(VhostShadowVirtqueue *svq);
 
 VhostShadowVirtqueue *vhost_svq_new(void);
 
diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -XXX,XX +XXX,XX @@ static int vhost_vdpa_set_vring_dev_addr(struct vhost_dev *dev,
  * Note that this function does not rewind kick file descriptor if cannot set
  * call one.
  */
-static bool vhost_vdpa_svq_setup(struct vhost_dev *dev,
-                                 VhostShadowVirtqueue *svq,
-                                 unsigned idx,
-                                 Error **errp)
+static int vhost_vdpa_svq_set_fds(struct vhost_dev *dev,
+                                  VhostShadowVirtqueue *svq,
+                                  unsigned idx,
+                                  Error **errp)
 {
     struct vhost_vring_file file = {
         .index = dev->vq_index + idx,
@@ -XXX,XX +XXX,XX @@ static bool vhost_vdpa_svq_setup(struct vhost_dev *dev,
     r = vhost_vdpa_set_vring_dev_kick(dev, &file);
     if (unlikely(r != 0)) {
         error_setg_errno(errp, -r, "Can't set device kick fd");
-        return false;
+        return r;
     }
 
     event_notifier = &svq->hdev_call;
@@ -XXX,XX +XXX,XX @@ static bool vhost_vdpa_svq_setup(struct vhost_dev *dev,
         error_setg_errno(errp, -r, "Can't set device call fd");
     }
 
+    return r;
+}
+
+/**
+ * Unmap a SVQ area in the device
+ */
+static bool vhost_vdpa_svq_unmap_ring(struct vhost_vdpa *v, hwaddr iova,
+                                      hwaddr size)
+{
+    int r;
+
+    size = ROUND_UP(size, qemu_real_host_page_size);
+    r = vhost_vdpa_dma_unmap(v, iova, size);
+    return r == 0;
+}
+
+static bool vhost_vdpa_svq_unmap_rings(struct vhost_dev *dev,
+                                       const VhostShadowVirtqueue *svq)
+{
+    struct vhost_vdpa *v = dev->opaque;
+    struct vhost_vring_addr svq_addr;
+    size_t device_size = vhost_svq_device_area_size(svq);
+    size_t driver_size = vhost_svq_driver_area_size(svq);
+    bool ok;
+
+    vhost_svq_get_vring_addr(svq, &svq_addr);
+
+    ok = vhost_vdpa_svq_unmap_ring(v, svq_addr.desc_user_addr, driver_size);
+    if (unlikely(!ok)) {
+        return false;
+    }
+
+    return vhost_vdpa_svq_unmap_ring(v, svq_addr.used_user_addr, device_size);
+}
+
+/**
+ * Map the shadow virtqueue rings in the device
+ *
+ * @dev: The vhost device
+ * @svq: The shadow virtqueue
+ * @addr: Assigned IOVA addresses
+ * @errp: Error pointer
+ */
+static bool vhost_vdpa_svq_map_rings(struct vhost_dev *dev,
+                                     const VhostShadowVirtqueue *svq,
+                                     struct vhost_vring_addr *addr,
+                                     Error **errp)
+{
+    struct vhost_vdpa *v = dev->opaque;
+    size_t device_size = vhost_svq_device_area_size(svq);
+    size_t driver_size = vhost_svq_driver_area_size(svq);
+    int r;
+
+    ERRP_GUARD();
+    vhost_svq_get_vring_addr(svq, addr);
+
+    r = vhost_vdpa_dma_map(v, addr->desc_user_addr, driver_size,
+                           (void *)addr->desc_user_addr, true);
+    if (unlikely(r != 0)) {
+        error_setg_errno(errp, -r, "Cannot create vq driver region: ");
+        return false;
+    }
+
+    r = vhost_vdpa_dma_map(v, addr->used_user_addr, device_size,
+                           (void *)addr->used_user_addr, false);
+    if (unlikely(r != 0)) {
+        error_setg_errno(errp, -r, "Cannot create vq device region: ");
+    }
+
+    return r == 0;
+}
+
+static bool vhost_vdpa_svq_setup(struct vhost_dev *dev,
+                                 VhostShadowVirtqueue *svq,
+                                 unsigned idx,
+                                 Error **errp)
+{
+    uint16_t vq_index = dev->vq_index + idx;
+    struct vhost_vring_state s = {
+        .index = vq_index,
+    };
+    int r;
+
+    r = vhost_vdpa_set_dev_vring_base(dev, &s);
+    if (unlikely(r)) {
+        error_setg_errno(errp, -r, "Cannot set vring base");
+        return false;
+    }
+
+    r = vhost_vdpa_svq_set_fds(dev, svq, idx, errp);
     return r == 0;
 }
 
@@ -XXX,XX +XXX,XX @@ static bool vhost_vdpa_svqs_start(struct vhost_dev *dev)
     }
 
     for (i = 0; i < v->shadow_vqs->len; ++i) {
+        VirtQueue *vq = virtio_get_queue(dev->vdev, dev->vq_index + i);
         VhostShadowVirtqueue *svq = g_ptr_array_index(v->shadow_vqs, i);
+        struct vhost_vring_addr addr = {
+            .index = i,
+        };
+        int r;
         bool ok = vhost_vdpa_svq_setup(dev, svq, i, &err);
         if (unlikely(!ok)) {
-            error_reportf_err(err, "Cannot setup SVQ %u: ", i);
+            goto err;
+        }
+
+        vhost_svq_start(svq, dev->vdev, vq);
+        ok = vhost_vdpa_svq_map_rings(dev, svq, &addr, &err);
+        if (unlikely(!ok)) {
+            goto err_map;
+        }
+
+        /* Override vring GPA set by vhost subsystem */
+        r = vhost_vdpa_set_vring_dev_addr(dev, &addr);
+        if (unlikely(r != 0)) {
+            error_setg_errno(&err, -r, "Cannot set device address");
+            goto err_set_addr;
+        }
+    }
+
+    return true;
+
+err_set_addr:
+    vhost_vdpa_svq_unmap_rings(dev, g_ptr_array_index(v->shadow_vqs, i));
+
+err_map:
+    vhost_svq_stop(g_ptr_array_index(v->shadow_vqs, i));
+
+err:
+    error_reportf_err(err, "Cannot setup SVQ %u: ", i);
+    for (unsigned j = 0; j < i; ++j) {
+        VhostShadowVirtqueue *svq = g_ptr_array_index(v->shadow_vqs, j);
+        vhost_vdpa_svq_unmap_rings(dev, svq);
+        vhost_svq_stop(svq);
+    }
+
+    return false;
+}
+
+static bool vhost_vdpa_svqs_stop(struct vhost_dev *dev)
+{
+    struct vhost_vdpa *v = dev->opaque;
+
+    if (!v->shadow_vqs) {
+        return true;
+    }
+
+    for (unsigned i = 0; i < v->shadow_vqs->len; ++i) {
+        VhostShadowVirtqueue *svq = g_ptr_array_index(v->shadow_vqs,
+                                                      i);
+        bool ok = vhost_vdpa_svq_unmap_rings(dev, svq);
+        if (unlikely(!ok)) {
             return false;
         }
     }
@@ -XXX,XX +XXX,XX @@ static int vhost_vdpa_dev_start(struct vhost_dev *dev, bool started)
         }
         vhost_vdpa_set_vring_ready(dev);
     } else {
+        ok = vhost_vdpa_svqs_stop(dev);
+        if (unlikely(!ok)) {
+            return -1;
+        }
         vhost_vdpa_host_notifiers_uninit(dev, dev->nvqs);
     }
 
--
2.7.4

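For orientation, here is a minimal sketch of the split-ring sizing that
vhost_svq_start() above relies on. This is an illustration, not code from the
series: the real helpers are vhost_svq_driver_area_size() and
vhost_svq_device_area_size(), and whether the trailing event-suppression words
are counted is a detail of those helpers.

  /*
   * Sketch: sizes of the two page-aligned areas vhost_svq_start()
   * allocates for a split virtqueue with `num` entries.  The driver
   * area packs the descriptor table and the avail ring into one
   * allocation; the used ring gets its own.
   */
  static size_t sketch_driver_area_size(uint16_t num)
  {
      size_t desc = sizeof(vring_desc_t) * num;
      size_t avail = offsetof(vring_avail_t, ring) + sizeof(uint16_t) * num;

      return ROUND_UP(desc + avail, qemu_real_host_page_size);
  }

  static size_t sketch_device_area_size(uint16_t num)
  {
      size_t used = offsetof(vring_used_t, ring) +
                    sizeof(vring_used_elem_t) * num;

      return ROUND_UP(used, qemu_real_host_page_size);
  }

Keeping the two areas in separate page-aligned allocations is what lets
vhost_vdpa_svq_map_rings() above map the driver area read-only for the device
while the used ring stays writable.
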
From: Eugenio Pérez <eperezma@redhat.com>

This iova tree function looks for a hole in the allocated regions and
returns a brand new translation for a given translated address.

Its main use is to let devices access the qemu address space by
remapping the guest's addresses into a new iova space to which qemu can
add chunks of addresses.

Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 include/qemu/iova-tree.h |  18 +++++++
 util/iova-tree.c         | 135 +++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 153 insertions(+)

diff --git a/include/qemu/iova-tree.h b/include/qemu/iova-tree.h
index XXXXXXX..XXXXXXX 100644
--- a/include/qemu/iova-tree.h
+++ b/include/qemu/iova-tree.h
@@ -XXX,XX +XXX,XX @@
 #define  IOVA_OK           (0)
 #define  IOVA_ERR_INVALID  (-1) /* Invalid parameters */
 #define  IOVA_ERR_OVERLAP  (-2) /* IOVA range overlapped */
+#define  IOVA_ERR_NOMEM    (-3) /* Cannot allocate */
 
 typedef struct IOVATree IOVATree;
 typedef struct DMAMap {
@@ -XXX,XX +XXX,XX @@ const DMAMap *iova_tree_find_address(const IOVATree *tree, hwaddr iova);
 void iova_tree_foreach(IOVATree *tree, iova_tree_iterator iterator);
 
 /**
+ * iova_tree_alloc_map:
+ *
+ * @tree: the iova tree to allocate from
+ * @map: the new map (as translated addr & size) to allocate in the iova region
+ * @iova_begin: the minimum address of the allocation
+ * @iova_end: the maximum address (inclusive) of the allocation
+ *
+ * Allocates a new region of a given size, between iova_begin and iova_end.
+ *
+ * Return: Same as iova_tree_insert, but it cannot overlap and it can return
+ * an error if the iova tree is out of free contiguous range. The caller gets
+ * the assigned iova in map->iova.
+ */
+int iova_tree_alloc_map(IOVATree *tree, DMAMap *map, hwaddr iova_begin,
+                        hwaddr iova_end);
+
+/**
  * iova_tree_destroy:
  *
  * @tree: the iova tree to destroy
diff --git a/util/iova-tree.c b/util/iova-tree.c
index XXXXXXX..XXXXXXX 100644
--- a/util/iova-tree.c
+++ b/util/iova-tree.c
@@ -XXX,XX +XXX,XX @@ struct IOVATree {
     GTree *tree;
 };
 
+/* Args to pass to iova_tree_alloc foreach function. */
+struct IOVATreeAllocArgs {
+    /* Size of the desired allocation */
+    size_t new_size;
+
+    /* The minimum address allowed in the allocation */
+    hwaddr iova_begin;
+
+    /* Map at the left of the hole, can be NULL if "this" is first one */
+    const DMAMap *prev;
+
+    /* Map at the right of the hole, can be NULL if "prev" is the last one */
+    const DMAMap *this;
+
+    /* If found, we fill in the IOVA here */
+    hwaddr iova_result;
+
+    /* Whether we have found a valid IOVA */
+    bool iova_found;
+};
+
+/**
+ * Iterate args to the next hole
+ *
+ * @args: The alloc arguments
+ * @next: The next mapping in the tree. Can be NULL to signal the last one
+ */
+static void iova_tree_alloc_args_iterate(struct IOVATreeAllocArgs *args,
+                                         const DMAMap *next) {
+    args->prev = args->this;
+    args->this = next;
+}
+
 static int iova_tree_compare(gconstpointer a, gconstpointer b, gpointer data)
 {
     const DMAMap *m1 = a, *m2 = b;
@@ -XXX,XX +XXX,XX @@ int iova_tree_remove(IOVATree *tree, const DMAMap *map)
     return IOVA_OK;
 }
 
+/**
+ * Try to find an unallocated IOVA range between prev and this elements.
+ *
+ * @args: Arguments to allocation
+ *
+ * Cases:
+ *
+ * (1) !prev, !this: No entries allocated, always succeed
+ *
+ * (2) !prev, this: We're iterating at the 1st element.
+ *
+ * (3) prev, !this: We're iterating at the last element.
+ *
+ * (4) prev, this: this is the most common case, we'll try to find a hole
+ * between "prev" and "this" mapping.
+ *
+ * Note that this function assumes the last valid iova is HWADDR_MAX, but it
+ * searches linearly so it's easy to discard the result if it's not the case.
+ */
+static void iova_tree_alloc_map_in_hole(struct IOVATreeAllocArgs *args)
+{
+    const DMAMap *prev = args->prev, *this = args->this;
+    uint64_t hole_start, hole_last;
+
+    if (this && this->iova + this->size < args->iova_begin) {
+        return;
+    }
+
+    hole_start = MAX(prev ? prev->iova + prev->size + 1 : 0, args->iova_begin);
+    hole_last = this ? this->iova : HWADDR_MAX;
+
+    if (hole_last - hole_start > args->new_size) {
+        args->iova_result = hole_start;
+        args->iova_found = true;
+    }
+}
+
+/**
+ * Foreach dma node in the tree, compare if there is a hole with its previous
+ * node (or minimum iova address allowed) and the node.
+ *
+ * @key: Node iterating
+ * @value: Node iterating
+ * @pargs: Struct to communicate with the outside world
+ *
+ * Return: false to keep iterating, true if needs break.
+ */
+static gboolean iova_tree_alloc_traverse(gpointer key, gpointer value,
+                                         gpointer pargs)
+{
+    struct IOVATreeAllocArgs *args = pargs;
+    DMAMap *node = value;
+
+    assert(key == value);
+
+    iova_tree_alloc_args_iterate(args, node);
+    iova_tree_alloc_map_in_hole(args);
+    return args->iova_found;
+}
+
+int iova_tree_alloc_map(IOVATree *tree, DMAMap *map, hwaddr iova_begin,
+                        hwaddr iova_last)
+{
+    struct IOVATreeAllocArgs args = {
+        .new_size = map->size,
+        .iova_begin = iova_begin,
+    };
+
+    if (unlikely(iova_last < iova_begin)) {
+        return IOVA_ERR_INVALID;
+    }
+
+    /*
+     * Find a valid hole for the mapping
+     *
+     * Assuming low iova_begin, so no need to do a binary search to
+     * locate the first node.
+     *
+     * TODO: Replace all this with g_tree_node_first/next/last when available
+     * (from glib since 2.68). To do it with g_tree_foreach complicates the
+     * code a lot.
+     *
+     */
+    g_tree_foreach(tree->tree, iova_tree_alloc_traverse, &args);
+    if (!args.iova_found) {
+        /*
+         * Either tree is empty or the last hole is still not checked.
+         * g_tree_foreach does not compare (last, iova_last] range, so we check
+         * it here.
+         */
+        iova_tree_alloc_args_iterate(&args, NULL);
+        iova_tree_alloc_map_in_hole(&args);
+    }
+
+    if (!args.iova_found || args.iova_result + map->size > iova_last) {
+        return IOVA_ERR_NOMEM;
+    }
+
+    map->iova = args.iova_result;
+    return iova_tree_insert(tree, map);
+}
+
 void iova_tree_destroy(IOVATree *tree)
 {
     g_tree_destroy(tree->tree);
--
2.7.4

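To make the new allocator's contract concrete, a hypothetical caller could
look like this. It is an illustration only; note that DMAMap's size field is
an inclusive limit, i.e. length - 1, judging by how iova_tree_find_address()
uses size 0 for a single address.

  /* Reserve an iova range for a host buffer; the result lands in map.iova */
  void sketch_alloc(IOVATree *tree, void *host_buf, size_t len)
  {
      DMAMap map = {
          .translated_addr = (hwaddr)(uintptr_t)host_buf,
          .size = len - 1,
          .perm = IOMMU_RW,
      };

      if (iova_tree_alloc_map(tree, &map, 0x1000, HWADDR_MAX) == IOVA_OK) {
          /* map.iova now holds an allocated address in [0x1000, HWADDR_MAX] */
      }
  }
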
From: Eugenio Pérez <eperezma@redhat.com>

This function does the reverse operation of iova_tree_find: it looks for
a mapping that matches a translated address, so we can translate back
from qemu addresses to iova.

This has linear complexity instead of logarithmic, but it supports
overlapping HVA ranges. Future developments could reduce the complexity.

Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 include/qemu/iova-tree.h | 20 +++++++++++++++++++-
 util/iova-tree.c         | 34 ++++++++++++++++++++++++++++++++++
 2 files changed, 53 insertions(+), 1 deletion(-)

diff --git a/include/qemu/iova-tree.h b/include/qemu/iova-tree.h
index XXXXXXX..XXXXXXX 100644
--- a/include/qemu/iova-tree.h
+++ b/include/qemu/iova-tree.h
@@ -XXX,XX +XXX,XX @@ int iova_tree_remove(IOVATree *tree, const DMAMap *map);
  * @tree: the iova tree to search from
  * @map: the mapping to search
  *
- * Search for a mapping in the iova tree that overlaps with the
+ * Search for a mapping in the iova tree whose iova overlaps with the
  * mapping range specified.  Only the first found mapping will be
  * returned.
  *
@@ -XXX,XX +XXX,XX @@ int iova_tree_remove(IOVATree *tree, const DMAMap *map);
 const DMAMap *iova_tree_find(const IOVATree *tree, const DMAMap *map);
 
 /**
+ * iova_tree_find_iova:
+ *
+ * @tree: the iova tree to search from
+ * @map: the mapping to search
+ *
+ * Search for a mapping in the iova tree whose translated_addr overlaps
+ * with the mapping range specified.  Only the first found mapping will
+ * be returned.
+ *
+ * Return: DMAMap pointer if found, or NULL if not found.  Note that
+ * the returned DMAMap pointer is maintained internally.  User should
+ * only read the content but never modify or free the content.  Also,
+ * user is responsible to make sure the pointer is valid (say, no
+ * concurrent deletion in progress).
+ */
+const DMAMap *iova_tree_find_iova(const IOVATree *tree, const DMAMap *map);
+
+/**
  * iova_tree_find_address:
  *
  * @tree: the iova tree to search from
diff --git a/util/iova-tree.c b/util/iova-tree.c
index XXXXXXX..XXXXXXX 100644
--- a/util/iova-tree.c
+++ b/util/iova-tree.c
@@ -XXX,XX +XXX,XX @@ struct IOVATreeAllocArgs {
     bool iova_found;
 };
 
+typedef struct IOVATreeFindIOVAArgs {
+    const DMAMap *needle;
+    const DMAMap *result;
+} IOVATreeFindIOVAArgs;
+
 /**
  * Iterate args to the next hole
  *
@@ -XXX,XX +XXX,XX @@ const DMAMap *iova_tree_find(const IOVATree *tree, const DMAMap *map)
     return g_tree_lookup(tree->tree, map);
 }
 
+static gboolean iova_tree_find_address_iterator(gpointer key, gpointer value,
+                                                gpointer data)
+{
+    const DMAMap *map = key;
+    IOVATreeFindIOVAArgs *args = data;
+    const DMAMap *needle;
+
+    g_assert(key == value);
+
+    needle = args->needle;
+    if (map->translated_addr + map->size < needle->translated_addr ||
+        needle->translated_addr + needle->size < map->translated_addr) {
+        return false;
+    }
+
+    args->result = map;
+    return true;
+}
+
+const DMAMap *iova_tree_find_iova(const IOVATree *tree, const DMAMap *map)
+{
+    IOVATreeFindIOVAArgs args = {
+        .needle = map,
+    };
+
+    g_tree_foreach(tree->tree, iova_tree_find_address_iterator, &args);
+    return args.result;
+}
+
 const DMAMap *iova_tree_find_address(const IOVATree *tree, hwaddr iova)
 {
     const DMAMap map = { .iova = iova, .size = 0 };
--
2.7.4

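A short sketch of the two lookup directions (illustration only; the addresses
are made up):

  void sketch_lookups(IOVATree *tree)
  {
      DMAMap map = {
          .iova = 0x100000,
          .translated_addr = 0x7f0000000000ULL,
          .size = 0xfff, /* inclusive: a 4 KiB mapping */
          .perm = IOMMU_RW,
      };
      DMAMap needle = {
          .translated_addr = 0x7f0000000800ULL,
          .size = 0,
      };

      iova_tree_insert(tree, &map);

      /* Forward: iova -> mapping, logarithmic GTree lookup */
      const DMAMap *by_iova = iova_tree_find_address(tree, 0x100800);

      /* Reverse: qemu VA -> mapping, the linear scan added by this patch */
      const DMAMap *by_hva = iova_tree_find_iova(tree, &needle);

      (void)by_iova;
      (void)by_hva;
  }

Both calls land on the same mapping here; only the key they search by differs.
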
From: Eugenio Pérez <eperezma@redhat.com>

This tree is able to look for a translated address from an IOVA address.

At first glance it is similar to util/iova-tree. However, SVQ, working
on devices with a limited IOVA space, needs more capabilities, like
allocating IOVA chunks or performing reverse translations (qemu
addresses to iova).

The allocation capability, as in "assign a free IOVA address to this
chunk of memory in qemu's address space", allows the shadow virtqueue
to create a new address space that is not restricted by the guest's
addressable one, so we can allocate the shadow vqs' vrings outside of
it.

It duplicates the tree so it can search efficiently in both directions,
and it will signal overlap if the iova or the translated address is
present in either tree.

Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 hw/virtio/meson.build       |   2 +-
 hw/virtio/vhost-iova-tree.c | 110 ++++++++++++++++++++++++++++++++++++++++++++
 hw/virtio/vhost-iova-tree.h |  27 +++++++++++
 3 files changed, 138 insertions(+), 1 deletion(-)
 create mode 100644 hw/virtio/vhost-iova-tree.c
 create mode 100644 hw/virtio/vhost-iova-tree.h

diff --git a/hw/virtio/meson.build b/hw/virtio/meson.build
index XXXXXXX..XXXXXXX 100644
--- a/hw/virtio/meson.build
+++ b/hw/virtio/meson.build
@@ -XXX,XX +XXX,XX @@ softmmu_ss.add(when: 'CONFIG_ALL', if_true: files('vhost-stub.c'))
 
 virtio_ss = ss.source_set()
 virtio_ss.add(files('virtio.c'))
-virtio_ss.add(when: 'CONFIG_VHOST', if_true: files('vhost.c', 'vhost-backend.c', 'vhost-shadow-virtqueue.c'))
+virtio_ss.add(when: 'CONFIG_VHOST', if_true: files('vhost.c', 'vhost-backend.c', 'vhost-shadow-virtqueue.c', 'vhost-iova-tree.c'))
 virtio_ss.add(when: 'CONFIG_VHOST_USER', if_true: files('vhost-user.c'))
 virtio_ss.add(when: 'CONFIG_VHOST_VDPA', if_true: files('vhost-vdpa.c'))
 virtio_ss.add(when: 'CONFIG_VIRTIO_BALLOON', if_true: files('virtio-balloon.c'))
diff --git a/hw/virtio/vhost-iova-tree.c b/hw/virtio/vhost-iova-tree.c
new file mode 100644
index XXXXXXX..XXXXXXX
--- /dev/null
+++ b/hw/virtio/vhost-iova-tree.c
@@ -XXX,XX +XXX,XX @@
+/*
+ * vhost software live migration iova tree
+ *
+ * SPDX-FileCopyrightText: Red Hat, Inc. 2021
+ * SPDX-FileContributor: Author: Eugenio Pérez <eperezma@redhat.com>
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/iova-tree.h"
+#include "vhost-iova-tree.h"
+
+#define iova_min_addr qemu_real_host_page_size
+
+/**
+ * VhostIOVATree, able to:
+ * - Translate iova address
+ * - Reverse translate iova address (from translated to iova)
+ * - Allocate IOVA regions for translated range (linear operation)
+ */
+struct VhostIOVATree {
+    /* First addressable iova address in the device */
+    uint64_t iova_first;
+
+    /* Last addressable iova address in the device */
+    uint64_t iova_last;
+
+    /* IOVA address to qemu memory maps. */
+    IOVATree *iova_taddr_map;
+};
+
+/**
+ * Create a new IOVA tree
+ *
+ * Returns the new IOVA tree
+ */
+VhostIOVATree *vhost_iova_tree_new(hwaddr iova_first, hwaddr iova_last)
+{
+    VhostIOVATree *tree = g_new(VhostIOVATree, 1);
+
+    /* Some devices do not like 0 addresses */
+    tree->iova_first = MAX(iova_first, iova_min_addr);
+    tree->iova_last = iova_last;
+
+    tree->iova_taddr_map = iova_tree_new();
+    return tree;
+}
+
+/**
+ * Delete an iova tree
+ */
+void vhost_iova_tree_delete(VhostIOVATree *iova_tree)
+{
+    iova_tree_destroy(iova_tree->iova_taddr_map);
+    g_free(iova_tree);
+}
+
+/**
+ * Find the IOVA address stored from a memory address
+ *
+ * @tree: The iova tree
+ * @map: The map with the memory address
+ *
+ * Return the stored mapping, or NULL if not found.
+ */
+const DMAMap *vhost_iova_tree_find_iova(const VhostIOVATree *tree,
+                                        const DMAMap *map)
+{
+    return iova_tree_find_iova(tree->iova_taddr_map, map);
+}
+
+/**
+ * Allocate a new mapping
+ *
+ * @tree: The iova tree
+ * @map: The iova map
+ *
+ * Returns:
+ * - IOVA_OK if the map fits in the container
+ * - IOVA_ERR_INVALID if the map does not make sense (like size overflow)
+ * - IOVA_ERR_NOMEM if tree cannot allocate more space.
+ *
+ * It returns the assigned iova in map->iova if the return value is IOVA_OK.
+ */
+int vhost_iova_tree_map_alloc(VhostIOVATree *tree, DMAMap *map)
+{
+    /* Some vhost devices do not like addr 0. Skip first page */
+    hwaddr iova_first = tree->iova_first ?: qemu_real_host_page_size;
+
+    if (map->translated_addr + map->size < map->translated_addr ||
+        map->perm == IOMMU_NONE) {
+        return IOVA_ERR_INVALID;
+    }
+
+    /* Allocate a node in IOVA address */
+    return iova_tree_alloc_map(tree->iova_taddr_map, map, iova_first,
+                               tree->iova_last);
+}
+
+/**
+ * Remove existing mappings from iova tree
+ *
+ * @iova_tree: The vhost iova tree
+ * @map: The map to remove
+ */
+void vhost_iova_tree_remove(VhostIOVATree *iova_tree, const DMAMap *map)
+{
+    iova_tree_remove(iova_tree->iova_taddr_map, map);
+}
diff --git a/hw/virtio/vhost-iova-tree.h b/hw/virtio/vhost-iova-tree.h
new file mode 100644
index XXXXXXX..XXXXXXX
--- /dev/null
+++ b/hw/virtio/vhost-iova-tree.h
@@ -XXX,XX +XXX,XX @@
+/*
+ * vhost software live migration iova tree
+ *
+ * SPDX-FileCopyrightText: Red Hat, Inc. 2021
+ * SPDX-FileContributor: Author: Eugenio Pérez <eperezma@redhat.com>
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#ifndef HW_VIRTIO_VHOST_IOVA_TREE_H
+#define HW_VIRTIO_VHOST_IOVA_TREE_H
+
+#include "qemu/iova-tree.h"
+#include "exec/memory.h"
+
+typedef struct VhostIOVATree VhostIOVATree;
+
+VhostIOVATree *vhost_iova_tree_new(uint64_t iova_first, uint64_t iova_last);
+void vhost_iova_tree_delete(VhostIOVATree *iova_tree);
+G_DEFINE_AUTOPTR_CLEANUP_FUNC(VhostIOVATree, vhost_iova_tree_delete);
+
+const DMAMap *vhost_iova_tree_find_iova(const VhostIOVATree *iova_tree,
+                                        const DMAMap *map);
+int vhost_iova_tree_map_alloc(VhostIOVATree *iova_tree, DMAMap *map);
+void vhost_iova_tree_remove(VhostIOVATree *iova_tree, const DMAMap *map);
+
+#endif
--
2.7.4

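Putting the pieces together, the flow the vdpa code is expected to follow with
VhostIOVATree looks roughly like this (illustration only; the device limits
and buffer are made up):

  void sketch_vhost_tree(void *qemu_buf, size_t len)
  {
      g_autoptr(VhostIOVATree) tree = vhost_iova_tree_new(0, (1ULL << 30) - 1);
      DMAMap map = {
          .translated_addr = (hwaddr)(uintptr_t)qemu_buf,
          .size = len - 1, /* inclusive size */
          .perm = IOMMU_RW,
      };

      if (vhost_iova_tree_map_alloc(tree, &map) != IOVA_OK) {
          return; /* no free iova hole big enough */
      }

      /* ... program the device with map.iova instead of the qemu VA ... */

      vhost_iova_tree_remove(tree, &map);
  }
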
1
From: Alexander Bulekov <alxndr@bu.edu>
1
From: Eugenio Pérez <eperezma@redhat.com>
2
2
3
This patch switches to use qemu_receive_packet() which can detect
3
Use translations added in VhostIOVATree in SVQ.
4
reentrancy and return early.
4
5
5
Only introduce usage here, not allocation and deallocation. As with
6
This is intended to address CVE-2021-3416.
6
previous patches, we use the dead code paths of shadow_vqs_enabled to
7
7
avoid commiting too many changes at once. These are impossible to take
8
Cc: Prasad J Pandit <ppandit@redhat.com>
8
at the moment.
9
Cc: qemu-stable@nongnu.org
9
10
Buglink: https://bugs.launchpad.net/qemu/+bug/1910826
10
Acked-by: Michael S. Tsirkin <mst@redhat.com>
11
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com
11
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
12
Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
13
Signed-off-by: Jason Wang <jasowang@redhat.com>
12
Signed-off-by: Jason Wang <jasowang@redhat.com>
14
---
13
---
 hw/virtio/vhost-shadow-virtqueue.c |  75 +++++++++++++++++++++--
 hw/virtio/vhost-shadow-virtqueue.h |   6 +-
 hw/virtio/vhost-vdpa.c             | 122 +++++++++++++++++++++++++++++++------
 include/hw/virtio/vhost-vdpa.h     |   3 +
 4 files changed, 181 insertions(+), 25 deletions(-)

diff --git a/hw/virtio/vhost-shadow-virtqueue.c b/hw/virtio/vhost-shadow-virtqueue.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/virtio/vhost-shadow-virtqueue.c
+++ b/hw/virtio/vhost-shadow-virtqueue.c
@@ -XXX,XX +XXX,XX @@ static uint16_t vhost_svq_available_slots(const VhostShadowVirtqueue *svq)
     return svq->vring.num - (svq->shadow_avail_idx - svq->shadow_used_idx);
 }
 
+/**
+ * Translate addresses between the qemu's virtual address and the SVQ IOVA
+ *
+ * @svq: Shadow VirtQueue
+ * @vaddr: Translated IOVA addresses
+ * @iovec: Source qemu's VA addresses
+ * @num: Length of iovec and minimum length of vaddr
+ */
+static bool vhost_svq_translate_addr(const VhostShadowVirtqueue *svq,
+                                     void **addrs, const struct iovec *iovec,
+                                     size_t num)
+{
+    if (num == 0) {
+        return true;
+    }
+
+    for (size_t i = 0; i < num; ++i) {
+        DMAMap needle = {
+            .translated_addr = (hwaddr)iovec[i].iov_base,
+            .size = iovec[i].iov_len,
+        };
+        size_t off;
+
+        const DMAMap *map = vhost_iova_tree_find_iova(svq->iova_tree, &needle);
+        /*
+         * Map cannot be NULL since iova map contains all guest space and
+         * qemu already has a physical address mapped
+         */
+        if (unlikely(!map)) {
+            qemu_log_mask(LOG_GUEST_ERROR,
+                          "Invalid address 0x%"HWADDR_PRIx" given by guest",
+                          needle.translated_addr);
+            return false;
+        }
+
+        off = needle.translated_addr - map->translated_addr;
+        addrs[i] = (void *)(map->iova + off);
+
+        if (unlikely(int128_gt(int128_add(needle.translated_addr,
+                                          iovec[i].iov_len),
+                               map->translated_addr + map->size))) {
+            qemu_log_mask(LOG_GUEST_ERROR,
+                          "Guest buffer expands over iova range");
+            return false;
+        }
+    }
+
+    return true;
+}
+
 static void vhost_vring_write_descs(VhostShadowVirtqueue *svq,
+                                    void * const *sg,
                                     const struct iovec *iovec,
                                     size_t num, bool more_descs, bool write)
 {
@@ -XXX,XX +XXX,XX @@ static void vhost_vring_write_descs(VhostShadowVirtqueue *svq,
         } else {
             descs[i].flags = flags;
         }
-        descs[i].addr = cpu_to_le64((hwaddr)iovec[n].iov_base);
+        descs[i].addr = cpu_to_le64((hwaddr)sg[n]);
         descs[i].len = cpu_to_le32(iovec[n].iov_len);
 
         last = i;
@@ -XXX,XX +XXX,XX @@ static bool vhost_svq_add_split(VhostShadowVirtqueue *svq,
 {
     unsigned avail_idx;
     vring_avail_t *avail = svq->vring.avail;
+    bool ok;
+    g_autofree void **sgs = g_new(void *, MAX(elem->out_num, elem->in_num));
 
     *head = svq->free_head;
 
@@ -XXX,XX +XXX,XX @@ static bool vhost_svq_add_split(VhostShadowVirtqueue *svq,
         return false;
     }
 
-    vhost_vring_write_descs(svq, elem->out_sg, elem->out_num,
+    ok = vhost_svq_translate_addr(svq, sgs, elem->out_sg, elem->out_num);
+    if (unlikely(!ok)) {
+        return false;
+    }
+    vhost_vring_write_descs(svq, sgs, elem->out_sg, elem->out_num,
                             elem->in_num > 0, false);
-    vhost_vring_write_descs(svq, elem->in_sg, elem->in_num, false, true);
+
+
+    ok = vhost_svq_translate_addr(svq, sgs, elem->in_sg, elem->in_num);
+    if (unlikely(!ok)) {
+        return false;
+    }
+
+    vhost_vring_write_descs(svq, sgs, elem->in_sg, elem->in_num, false, true);
 
     /*
      * Put the entry in the available array (but don't update avail->idx until
@@ -XXX,XX +XXX,XX @@ void vhost_svq_stop(VhostShadowVirtqueue *svq)
  * Creates vhost shadow virtqueue, and instructs the vhost device to use the
  * shadow methods and file descriptors.
  *
+ * @iova_tree: Tree to perform descriptors translations
+ *
  * Returns the new virtqueue or NULL.
  *
  * In case of error, reason is reported through error_report.
  */
-VhostShadowVirtqueue *vhost_svq_new(void)
+VhostShadowVirtqueue *vhost_svq_new(VhostIOVATree *iova_tree)
 {
     g_autofree VhostShadowVirtqueue *svq = g_new0(VhostShadowVirtqueue, 1);
     int r;
 
@@ -XXX,XX +XXX,XX @@ VhostShadowVirtqueue *vhost_svq_new(void)
 
     event_notifier_init_fd(&svq->svq_kick, VHOST_FILE_UNBIND);
     event_notifier_set_handler(&svq->hdev_call, vhost_svq_handle_call);
+    svq->iova_tree = iova_tree;
     return g_steal_pointer(&svq);
 
 err_init_hdev_call:
diff --git a/hw/virtio/vhost-shadow-virtqueue.h b/hw/virtio/vhost-shadow-virtqueue.h
index XXXXXXX..XXXXXXX 100644
--- a/hw/virtio/vhost-shadow-virtqueue.h
+++ b/hw/virtio/vhost-shadow-virtqueue.h
@@ -XXX,XX +XXX,XX @@
 #include "qemu/event_notifier.h"
 #include "hw/virtio/virtio.h"
 #include "standard-headers/linux/vhost_types.h"
+#include "hw/virtio/vhost-iova-tree.h"
 
 /* Shadow virtqueue to relay notifications */
 typedef struct VhostShadowVirtqueue {
@@ -XXX,XX +XXX,XX @@ typedef struct VhostShadowVirtqueue {
     /* Virtio device */
     VirtIODevice *vdev;
 
+    /* IOVA mapping */
+    VhostIOVATree *iova_tree;
+
     /* Map for use the guest's descriptors */
     VirtQueueElement **ring_id_maps;
 
@@ -XXX,XX +XXX,XX @@ void vhost_svq_start(VhostShadowVirtqueue *svq, VirtIODevice *vdev,
                      VirtQueue *vq);
 void vhost_svq_stop(VhostShadowVirtqueue *svq);
 
-VhostShadowVirtqueue *vhost_svq_new(void);
+VhostShadowVirtqueue *vhost_svq_new(VhostIOVATree *iova_tree);
 
 void vhost_svq_free(gpointer vq);
 G_DEFINE_AUTOPTR_CLEANUP_FUNC(VhostShadowVirtqueue, vhost_svq_free);
diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -XXX,XX +XXX,XX @@ static void vhost_vdpa_listener_region_add(MemoryListener *listener,
                                          vaddr, section->readonly);
 
     llsize = int128_sub(llend, int128_make64(iova));
+    if (v->shadow_vqs_enabled) {
+        DMAMap mem_region = {
+            .translated_addr = (hwaddr)vaddr,
+            .size = int128_get64(llsize) - 1,
+            .perm = IOMMU_ACCESS_FLAG(true, section->readonly),
+        };
+
+        int r = vhost_iova_tree_map_alloc(v->iova_tree, &mem_region);
+        if (unlikely(r != IOVA_OK)) {
+            error_report("Can't allocate a mapping (%d)", r);
+            goto fail;
+        }
+
+        iova = mem_region.iova;
+    }
 
     vhost_vdpa_iotlb_batch_begin_once(v);
     ret = vhost_vdpa_dma_map(v, iova, int128_get64(llsize),
@@ -XXX,XX +XXX,XX @@ static void vhost_vdpa_listener_region_del(MemoryListener *listener,
 
     llsize = int128_sub(llend, int128_make64(iova));
 
+    if (v->shadow_vqs_enabled) {
+        const DMAMap *result;
+        const void *vaddr = memory_region_get_ram_ptr(section->mr) +
+            section->offset_within_region +
+            (iova - section->offset_within_address_space);
+        DMAMap mem_region = {
+            .translated_addr = (hwaddr)vaddr,
+            .size = int128_get64(llsize) - 1,
+        };
+
+        result = vhost_iova_tree_find_iova(v->iova_tree, &mem_region);
+        iova = result->iova;
+        vhost_iova_tree_remove(v->iova_tree, &mem_region);
+    }
     vhost_vdpa_iotlb_batch_begin_once(v);
     ret = vhost_vdpa_dma_unmap(v, iova, int128_get64(llsize));
     if (ret) {
@@ -XXX,XX +XXX,XX @@ static int vhost_vdpa_init_svq(struct vhost_dev *hdev, struct vhost_vdpa *v,
 
     shadow_vqs = g_ptr_array_new_full(hdev->nvqs, vhost_svq_free);
     for (unsigned n = 0; n < hdev->nvqs; ++n) {
-        g_autoptr(VhostShadowVirtqueue) svq = vhost_svq_new();
+        g_autoptr(VhostShadowVirtqueue) svq = vhost_svq_new(v->iova_tree);
 
         if (unlikely(!svq)) {
             error_setg(errp, "Cannot create svq %u", n);
@@ -XXX,XX +XXX,XX @@ static int vhost_vdpa_svq_set_fds(struct vhost_dev *dev,
 /**
  * Unmap a SVQ area in the device
  */
-static bool vhost_vdpa_svq_unmap_ring(struct vhost_vdpa *v, hwaddr iova,
-                                      hwaddr size)
+static bool vhost_vdpa_svq_unmap_ring(struct vhost_vdpa *v,
+                                      const DMAMap *needle)
 {
+    const DMAMap *result = vhost_iova_tree_find_iova(v->iova_tree, needle);
+    hwaddr size;
     int r;
 
-    size = ROUND_UP(size, qemu_real_host_page_size);
-    r = vhost_vdpa_dma_unmap(v, iova, size);
+    if (unlikely(!result)) {
+        error_report("Unable to find SVQ address to unmap");
+        return false;
+    }
+
+    size = ROUND_UP(result->size, qemu_real_host_page_size);
+    r = vhost_vdpa_dma_unmap(v, result->iova, size);
     return r == 0;
 }
 
 static bool vhost_vdpa_svq_unmap_rings(struct vhost_dev *dev,
                                        const VhostShadowVirtqueue *svq)
 {
+    DMAMap needle = {};
     struct vhost_vdpa *v = dev->opaque;
     struct vhost_vring_addr svq_addr;
-    size_t device_size = vhost_svq_device_area_size(svq);
-    size_t driver_size = vhost_svq_driver_area_size(svq);
     bool ok;
 
     vhost_svq_get_vring_addr(svq, &svq_addr);
 
-    ok = vhost_vdpa_svq_unmap_ring(v, svq_addr.desc_user_addr, driver_size);
+    needle.translated_addr = svq_addr.desc_user_addr;
+    ok = vhost_vdpa_svq_unmap_ring(v, &needle);
     if (unlikely(!ok)) {
         return false;
     }
 
-    return vhost_vdpa_svq_unmap_ring(v, svq_addr.used_user_addr, device_size);
+    needle.translated_addr = svq_addr.used_user_addr;
+    return vhost_vdpa_svq_unmap_ring(v, &needle);
+}
+
+/**
+ * Map the SVQ area in the device
+ *
+ * @v: Vhost-vdpa device
+ * @needle: The area to search iova
+ * @errorp: Error pointer
+ */
+static bool vhost_vdpa_svq_map_ring(struct vhost_vdpa *v, DMAMap *needle,
+                                    Error **errp)
+{
+    int r;
+
+    r = vhost_iova_tree_map_alloc(v->iova_tree, needle);
+    if (unlikely(r != IOVA_OK)) {
+        error_setg(errp, "Cannot allocate iova (%d)", r);
+        return false;
+    }
+
+    r = vhost_vdpa_dma_map(v, needle->iova, needle->size + 1,
+                           (void *)needle->translated_addr,
+                           needle->perm == IOMMU_RO);
+    if (unlikely(r != 0)) {
+        error_setg_errno(errp, -r, "Cannot map region to device");
+        vhost_iova_tree_remove(v->iova_tree, needle);
+    }
+
+    return r == 0;
 }
 
 /**
@@ -XXX,XX +XXX,XX @@ static bool vhost_vdpa_svq_map_rings(struct vhost_dev *dev,
                                      struct vhost_vring_addr *addr,
                                      Error **errp)
 {
+    DMAMap device_region, driver_region;
+    struct vhost_vring_addr svq_addr;
     struct vhost_vdpa *v = dev->opaque;
     size_t device_size = vhost_svq_device_area_size(svq);
     size_t driver_size = vhost_svq_driver_area_size(svq);
-    int r;
+    size_t avail_offset;
+    bool ok;
 
     ERRP_GUARD();
-    vhost_svq_get_vring_addr(svq, addr);
+    vhost_svq_get_vring_addr(svq, &svq_addr);
 
-    r = vhost_vdpa_dma_map(v, addr->desc_user_addr, driver_size,
-                           (void *)addr->desc_user_addr, true);
-    if (unlikely(r != 0)) {
-        error_setg_errno(errp, -r, "Cannot create vq driver region: ");
+    driver_region = (DMAMap) {
+        .translated_addr = svq_addr.desc_user_addr,
+        .size = driver_size - 1,
+        .perm = IOMMU_RO,
+    };
+    ok = vhost_vdpa_svq_map_ring(v, &driver_region, errp);
+    if (unlikely(!ok)) {
+        error_prepend(errp, "Cannot create vq driver region: ");
         return false;
     }
+    addr->desc_user_addr = driver_region.iova;
+    avail_offset = svq_addr.avail_user_addr - svq_addr.desc_user_addr;
+    addr->avail_user_addr = driver_region.iova + avail_offset;
 
-    r = vhost_vdpa_dma_map(v, addr->used_user_addr, device_size,
-                           (void *)addr->used_user_addr, false);
-    if (unlikely(r != 0)) {
-        error_setg_errno(errp, -r, "Cannot create vq device region: ");
+    device_region = (DMAMap) {
+        .translated_addr = svq_addr.used_user_addr,
+        .size = device_size - 1,
+        .perm = IOMMU_RW,
+    };
+    ok = vhost_vdpa_svq_map_ring(v, &device_region, errp);
+    if (unlikely(!ok)) {
+        error_prepend(errp, "Cannot create vq device region: ");
+        vhost_vdpa_svq_unmap_ring(v, &driver_region);
     }
+    addr->used_user_addr = device_region.iova;
 
-    return r == 0;
+    return ok;
 }
 
 static bool vhost_vdpa_svq_setup(struct vhost_dev *dev,
diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
index XXXXXXX..XXXXXXX 100644
--- a/include/hw/virtio/vhost-vdpa.h
+++ b/include/hw/virtio/vhost-vdpa.h
@@ -XXX,XX +XXX,XX @@
 
 #include <gmodule.h>
 
+#include "hw/virtio/vhost-iova-tree.h"
 #include "hw/virtio/virtio.h"
 #include "standard-headers/linux/vhost_types.h"
 
@@ -XXX,XX +XXX,XX @@ typedef struct vhost_vdpa {
     MemoryListener listener;
     struct vhost_vdpa_iova_range iova_range;
     bool shadow_vqs_enabled;
+    /* IOVA mapping used by the Shadow Virtqueue */
+    VhostIOVATree *iova_tree;
     GPtrArray *shadow_vqs;
     struct vhost_dev *dev;
     VhostVDPAHostNotifier notifier[VIRTIO_QUEUE_MAX];
--
2.7.4

From: Eugenio Pérez <eperezma@redhat.com>

This is needed to achieve migration, so the destination can restore its
index.

Set the base to the last used idx, so the destination will see as
available all the entries that the device did not use, including the
ones still being processed in flight.

This is ok for networking, but other kinds of devices might have
problems with these retransmissions.

Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
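For illustration only (this is not QEMU code): with the ring's
free-running 16-bit indices, the entries between last_used_idx and
last_avail_idx are exactly the in-flight ones, so reporting
last_used_idx as the base makes the destination re-expose them.

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint16_t last_avail_idx = 10; /* guest has posted entries 0..9 */
    uint16_t last_used_idx = 7;   /* device has completed entries 0..6 */

    /* subtraction is wrap-safe on free-running uint16_t indices */
    uint16_t in_flight = last_avail_idx - last_used_idx;

    /* entries 7, 8 and 9 would be retransmitted after migration */
    printf("%u in-flight entries re-exposed at the destination\n",
           in_flight);
    return 0;
}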
 hw/virtio/vhost-vdpa.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -XXX,XX +XXX,XX @@ static int vhost_vdpa_set_vring_base(struct vhost_dev *dev,
 static int vhost_vdpa_get_vring_base(struct vhost_dev *dev,
                                      struct vhost_vring_state *ring)
 {
+    struct vhost_vdpa *v = dev->opaque;
     int ret;
 
+    if (v->shadow_vqs_enabled) {
+        VhostShadowVirtqueue *svq = g_ptr_array_index(v->shadow_vqs,
+                                                      ring->index);
+
+        /*
+         * Setting base as last used idx, so destination will see as available
+         * all the entries that the device did not use, including the in-flight
+         * processing ones.
+         *
+         * TODO: This is ok for networking, but other kinds of devices might
+         * have problems with these retransmissions.
+         */
+        ring->num = svq->last_used_idx;
+        return 0;
+    }
+
     ret = vhost_vdpa_call(dev, VHOST_GET_VRING_BASE, ring);
     trace_vhost_vdpa_get_vring_base(dev, ring->index, ring->num);
     return ret;
--
2.7.4

From: Eugenio Pérez <eperezma@redhat.com>

Setting the log address would make the device start reporting invalid
dirty memory because the SVQ vrings are located in qemu's memory.

Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 hw/virtio/vhost-vdpa.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -XXX,XX +XXX,XX @@ static int vhost_vdpa_dev_start(struct vhost_dev *dev, bool started)
 static int vhost_vdpa_set_log_base(struct vhost_dev *dev, uint64_t base,
                                    struct vhost_log *log)
 {
-    if (vhost_vdpa_one_time_request(dev)) {
+    struct vhost_vdpa *v = dev->opaque;
+    if (v->shadow_vqs_enabled || vhost_vdpa_one_time_request(dev)) {
         return 0;
     }
 
--
2.7.4

From: Eugenio Pérez <eperezma@redhat.com>

SVQ is able to log the dirty bits by itself, so let's use it to not
block migration.

Also, ignore set and clear of VHOST_F_LOG_ALL on set_features if SVQ is
enabled. Even if the device supports it, the reports would be nonsense
because SVQ memory is in the qemu region.

The log region is still allocated. Future changes might skip that, but
this series is already long enough.

Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
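For illustration only (standalone sketch, not the patch itself): the
check added below XORs the previously acked feature set against the
newly requested one, which isolates the changed bits; if the only
changed bit is VHOST_F_LOG_ALL (bit 26 in the vhost UAPI), qemu is
merely toggling dirty logging and nothing needs to be forwarded to the
device.

#include <stdbool.h>
#include <stdint.h>

#define VHOST_F_LOG_ALL 26
#define BIT_ULL(nr)     (1ULL << (nr))

/* true when acked and requested differ only in the logging bit */
static bool only_log_toggled(uint64_t acked, uint64_t requested)
{
    return (acked ^ requested) == BIT_ULL(VHOST_F_LOG_ALL);
}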
 hw/virtio/vhost-vdpa.c         | 39 +++++++++++++++++++++++++++++++++++----
 include/hw/virtio/vhost-vdpa.h |  1 +
 2 files changed, 36 insertions(+), 4 deletions(-)

diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -XXX,XX +XXX,XX @@ static bool vhost_vdpa_one_time_request(struct vhost_dev *dev)
     return v->index != 0;
 }
 
+static int vhost_vdpa_get_dev_features(struct vhost_dev *dev,
+                                       uint64_t *features)
+{
+    int ret;
+
+    ret = vhost_vdpa_call(dev, VHOST_GET_FEATURES, features);
+    trace_vhost_vdpa_get_features(dev, *features);
+    return ret;
+}
+
 static int vhost_vdpa_init_svq(struct vhost_dev *hdev, struct vhost_vdpa *v,
                                Error **errp)
 {
@@ -XXX,XX +XXX,XX @@ static int vhost_vdpa_init_svq(struct vhost_dev *hdev, struct vhost_vdpa *v,
         return 0;
     }
 
-    r = hdev->vhost_ops->vhost_get_features(hdev, &dev_features);
+    r = vhost_vdpa_get_dev_features(hdev, &dev_features);
     if (r != 0) {
         error_setg_errno(errp, -r, "Can't get vdpa device features");
         return r;
@@ -XXX,XX +XXX,XX @@ static int vhost_vdpa_set_mem_table(struct vhost_dev *dev,
 static int vhost_vdpa_set_features(struct vhost_dev *dev,
                                    uint64_t features)
 {
+    struct vhost_vdpa *v = dev->opaque;
     int ret;
 
     if (vhost_vdpa_one_time_request(dev)) {
         return 0;
     }
 
+    if (v->shadow_vqs_enabled) {
+        if ((v->acked_features ^ features) == BIT_ULL(VHOST_F_LOG_ALL)) {
+            /*
+             * QEMU is just trying to enable or disable logging. SVQ handles
+             * this separately, so no need to forward this.
+             */
+            v->acked_features = features;
+            return 0;
+        }
+
+        v->acked_features = features;
+
+        /* We must not ack _F_LOG if SVQ is enabled */
+        features &= ~BIT_ULL(VHOST_F_LOG_ALL);
+    }
+
     trace_vhost_vdpa_set_features(dev, features);
     ret = vhost_vdpa_call(dev, VHOST_SET_FEATURES, &features);
     if (ret) {
@@ -XXX,XX +XXX,XX @@ static int vhost_vdpa_set_vring_call(struct vhost_dev *dev,
 static int vhost_vdpa_get_features(struct vhost_dev *dev,
                                    uint64_t *features)
 {
-    int ret;
+    struct vhost_vdpa *v = dev->opaque;
+    int ret = vhost_vdpa_get_dev_features(dev, features);
+
+    if (ret == 0 && v->shadow_vqs_enabled) {
+        /* Add SVQ logging capabilities */
+        *features |= BIT_ULL(VHOST_F_LOG_ALL);
+    }
 
-    ret = vhost_vdpa_call(dev, VHOST_GET_FEATURES, features);
-    trace_vhost_vdpa_get_features(dev, *features);
     return ret;
 }
 
diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
index XXXXXXX..XXXXXXX 100644
--- a/include/hw/virtio/vhost-vdpa.h
+++ b/include/hw/virtio/vhost-vdpa.h
@@ -XXX,XX +XXX,XX @@ typedef struct vhost_vdpa {
     bool iotlb_batch_begin_sent;
     MemoryListener listener;
     struct vhost_vdpa_iova_range iova_range;
+    uint64_t acked_features;
     bool shadow_vqs_enabled;
     /* IOVA mapping used by the Shadow Virtqueue */
     VhostIOVATree *iova_tree;
--
2.7.4

Deleted patch

This patch switches to using qemu_receive_packet(), which can detect
reentrancy and return early.

This is intended to address CVE-2021-3416.

Cc: Prasad J Pandit <ppandit@redhat.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
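For illustration only (not QEMU's implementation; all names here are
invented for the sketch): the idea behind qemu_receive_packet() is a
per-queue flag that notices when a loopback delivery re-enters the
receive path and returns early instead of recursing.

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct rx_queue {
    bool delivering; /* set while a packet is being delivered */
};

/* stand-in for the device's receive handler */
static size_t deliver(struct rx_queue *q, const uint8_t *buf, size_t len)
{
    (void)q;
    (void)buf;
    return len;
}

static size_t receive_packet(struct rx_queue *q,
                             const uint8_t *buf, size_t len)
{
    if (q->delivering) {
        return 0; /* reentrancy detected: drop instead of recursing */
    }
    q->delivering = true;
    size_t done = deliver(q, buf, len);
    q->delivering = false;
    return done;
}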
 hw/net/e1000.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/net/e1000.c b/hw/net/e1000.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/net/e1000.c
+++ b/hw/net/e1000.c
@@ -XXX,XX +XXX,XX @@ e1000_send_packet(E1000State *s, const uint8_t *buf, int size)
 
     NetClientState *nc = qemu_get_queue(s->nic);
     if (s->phy_reg[PHY_CTRL] & MII_CR_LOOPBACK) {
-        nc->info->receive(nc, buf, size);
+        qemu_receive_packet(nc, buf, size);
     } else {
         qemu_send_packet(nc, buf, size);
     }
--
2.7.4

Deleted patch

This patch switches to using qemu_receive_packet(), which can detect
reentrancy and return early.

This is intended to address CVE-2021-3416.

Cc: Prasad J Pandit <ppandit@redhat.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 hw/net/dp8393x.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/net/dp8393x.c b/hw/net/dp8393x.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/net/dp8393x.c
+++ b/hw/net/dp8393x.c
@@ -XXX,XX +XXX,XX @@ static void dp8393x_do_transmit_packets(dp8393xState *s)
             s->regs[SONIC_TCR] |= SONIC_TCR_CRSL;
             if (nc->info->can_receive(nc)) {
                 s->loopback_packet = 1;
-                nc->info->receive(nc, s->tx_buffer, tx_len);
+                qemu_receive_packet(nc, s->tx_buffer, tx_len);
             }
         } else {
             /* Transmit packet */
--
2.7.4

Deleted patch

This patch switches to using qemu_receive_packet(), which can detect
reentrancy and return early.

This is intended to address CVE-2021-3416.

Cc: Prasad J Pandit <ppandit@redhat.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 hw/net/msf2-emac.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/net/msf2-emac.c b/hw/net/msf2-emac.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/net/msf2-emac.c
+++ b/hw/net/msf2-emac.c
@@ -XXX,XX +XXX,XX @@ static void msf2_dma_tx(MSF2EmacState *s)
      * R_CFG1 bit 0 is set.
      */
     if (s->regs[R_CFG1] & R_CFG1_LB_EN_MASK) {
-        nc->info->receive(nc, buf, size);
+        qemu_receive_packet(nc, buf, size);
     } else {
         qemu_send_packet(nc, buf, size);
     }
--
2.7.4

Deleted patch

This patch switches to using qemu_receive_packet(), which can detect
reentrancy and return early.

This is intended to address CVE-2021-3416.

Cc: Prasad J Pandit <ppandit@redhat.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 hw/net/sungem.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/net/sungem.c b/hw/net/sungem.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/net/sungem.c
+++ b/hw/net/sungem.c
@@ -XXX,XX +XXX,XX @@ static void sungem_send_packet(SunGEMState *s, const uint8_t *buf,
     NetClientState *nc = qemu_get_queue(s->nic);
 
     if (s->macregs[MAC_XIFCFG >> 2] & MAC_XIFCFG_LBCK) {
-        nc->info->receive(nc, buf, size);
+        qemu_receive_packet(nc, buf, size);
     } else {
         qemu_send_packet(nc, buf, size);
     }
--
2.7.4
