 This series provides an asynchronous means of reporting free guest pages
 to QEMU through virtio-balloon so that the memory associated with those
 pages can be dropped and reused by other processes and/or guests on the
 host. Using this it is possible to avoid unnecessary I/O to disk and
 greatly improve performance in the case of memory overcommit on the host.

-I originally submitted this patch series back on February 11th 2020[1],
-but at that time I was focused primarily on the kernel portion of this
-patch set. However as of April 7th those patches are now included in
-Linus's kernel tree[2] and so I am submitting the QEMU pieces for
-inclusion.
+As of April 7th this functionality has been enabled in Linus's kernel
+tree[1] and so I am submitting the QEMU pieces for inclusion.

-[1]: https://lore.kernel.org/lkml/20200211224416.29318.44077.stgit@localhost.localdomain/
-[2]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=b0c504f154718904ae49349147e3b7e6ae91ffdc
+[1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=b0c504f154718904ae49349147e3b7e6ae91ffdc

 Changes from v17:
 Fixed typo in patch 1 title
 Addressed white-space issues reported via checkpatch
 Added braces {} for two if statements to match expected coding style

 Changes from v18:
 Updated patches 2 and 3 based on input from dhildenb
 Added comment to patch 2 describing what keeps us from reporting a bad page
 Added patch to address issue with ROM devices being directly writable

+Changes from v19:
+Added std-headers change to match changes pushed for linux kernel headers
+Added patch to remove "report" from page hinting code paths
+Updated comment to better explain why we disable hints w/ page poisoning
+Removed code that was modifying config size for poison vs hinting
+Dropped x-page-poison property
+Added code to bounds check the reported region vs the RAM block
+Dropped patch for ROM devices as that was already pulled in by Paolo
+
+Changes from v20:
+Rearranged patches to push Linux header sync patches to front
+Removed association between free page hinting and VIRTIO_BALLOON_F_PAGE_POISON
+Added code to enable VIRTIO_BALLOON_F_PAGE_POISON if page reporting is enabled
+Fixed possible resource leak if poison or qemu_balloon_is_inhibited return true
+Updated cover page comments
+
 ---

-Alexander Duyck (4):
+Alexander Duyck (5):
+linux-headers: Update to allow renaming of free_page_report_cmd_id
+linux-headers: update to contain virito-balloon free page reporting
+virtio-balloon: Replace free page hinting references to 'report' with 'hint'
 virtio-balloon: Implement support for page poison tracking feature
-linux-headers: update to contain virito-balloon free page reporting
 virtio-balloon: Provide an interface for free page reporting
-memory: Do not allow direct write access to rom_device regions


-hw/virtio/virtio-balloon.c | 85 ++++++++++++++++++++++-
-include/exec/memory.h | 4 +
-include/hw/virtio/virtio-balloon.h | 3 +
-include/standard-headers/linux/virtio_balloon.h | 1
-4 files changed, 86 insertions(+), 7 deletions(-)
+hw/virtio/virtio-balloon.c | 151 +++++++++++++++++------
+include/hw/virtio/virtio-balloon.h | 23 ++--
+include/standard-headers/linux/virtio_balloon.h | 12 ++
+3 files changed, 136 insertions(+), 50 deletions(-)

 --

 From: Alexander Duyck <alexander.h.duyck@linux.intel.com>

-According to the documentation in memory.h a ROM memory region will be
-backed by RAM for reads, but is supposed to go through a callback for
-writes. Currently we were not checking for the existence of the rom_device
-flag when determining if we could perform a direct write or not.
-
-To correct that add a check to memory_region_is_direct so that if the
-memory region has the rom_device flag set we will return false for all
-checks where is_write is set.
+Sync to the latest upstream changes for free page hinting. To be
+replaced by a full linux header sync.

 Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
 ---
-include/exec/memory.h | 4 ++--
-1 file changed, 2 insertions(+), 2 deletions(-)
+include/standard-headers/linux/virtio_balloon.h | 11 +++++++++--
+1 file changed, 9 insertions(+), 2 deletions(-)

-diff --git a/include/exec/memory.h b/include/exec/memory.h
+diff --git a/include/standard-headers/linux/virtio_balloon.h b/include/standard-headers/linux/virtio_balloon.h
 index XXXXXXX..XXXXXXX 100644
---- a/include/exec/memory.h
-+++ b/include/exec/memory.h
-@@ -XXX,XX +XXX,XX @@ void address_space_write_cached_slow(MemoryRegionCache *cache,
- static inline bool memory_access_is_direct(MemoryRegion *mr, bool is_write)
- {
-     if (is_write) {
--        return memory_region_is_ram(mr) &&
--               !mr->readonly && !memory_region_is_ram_device(mr);
-+        return memory_region_is_ram(mr) && !mr->readonly &&
-+               !mr->rom_device && !memory_region_is_ram_device(mr);
-     } else {
-         return (memory_region_is_ram(mr) && !memory_region_is_ram_device(mr)) ||
-                memory_region_is_romd(mr);
+--- a/include/standard-headers/linux/virtio_balloon.h
++++ b/include/standard-headers/linux/virtio_balloon.h
+@@ -XXX,XX +XXX,XX @@ struct virtio_balloon_config {
+     uint32_t num_pages;
+     /* Number of pages we've actually got in balloon. */
+     uint32_t actual;
+-    /* Free page report command id, readonly by guest */
+-    uint32_t free_page_report_cmd_id;
++    /*
++     * Free page hint command id, readonly by guest.
++     * Was previously name free_page_report_cmd_id so we
++     * need to carry that name for legacy support.
++     */
++    union {
++        uint32_t free_page_hint_cmd_id;
++        uint32_t free_page_report_cmd_id;    /* deprecated */
++    };
+     /* Stores PAGE_POISON if page poisoning is in use */
+     uint32_t poison_val;
+ };

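As a side note on the header change above, the anonymous union lets the new and the legacy field name refer to the same 32-bit slot, so the config layout does not change. The following is only a minimal standalone sketch of that idea: the struct name `demo_balloon_config` and the helper `demo_read_legacy` are made up for illustration, and only the four field names are taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative copy of the layout; this is NOT the real header. */
struct demo_balloon_config {
    uint32_t num_pages;
    uint32_t actual;
    union {
        uint32_t free_page_hint_cmd_id;      /* new name */
        uint32_t free_page_report_cmd_id;    /* legacy name, same storage */
    };
    uint32_t poison_val;
};

/* Both names must land on the same offset, and the size must not grow. */
static_assert(offsetof(struct demo_balloon_config, free_page_hint_cmd_id) ==
              offsetof(struct demo_balloon_config, free_page_report_cmd_id),
              "new and legacy names must alias the same field");
static_assert(sizeof(struct demo_balloon_config) == 4 * sizeof(uint32_t),
              "the union must not change the struct size");

/* Hypothetical helper: write through the new name, read via the old one. */
static uint32_t demo_read_legacy(uint32_t cmd_id)
{
    struct demo_balloon_config cfg = { 0 };

    cfg.free_page_hint_cmd_id = cmd_id;
    return cfg.free_page_report_cmd_id;
}
```

Code that still uses the old name keeps compiling and sees the same value, which is presumably why the legacy name is carried in the union rather than dropped.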
From: Alexander Duyck <alexander.h.duyck@linux.intel.com>

Sync the latest upstream changes for free page reporting. To be
replaced by a full linux header sync.

Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
---
include/standard-headers/linux/virtio_balloon.h | 1 +
1 file changed, 1 insertion(+)

diff --git a/include/standard-headers/linux/virtio_balloon.h b/include/standard-headers/linux/virtio_balloon.h
index XXXXXXX..XXXXXXX 100644
--- a/include/standard-headers/linux/virtio_balloon.h
+++ b/include/standard-headers/linux/virtio_balloon.h
@@ -XXX,XX +XXX,XX @@
 #define VIRTIO_BALLOON_F_DEFLATE_ON_OOM    2 /* Deflate balloon on OOM */
 #define VIRTIO_BALLOON_F_FREE_PAGE_HINT    3 /* VQ to report free pages */
 #define VIRTIO_BALLOON_F_PAGE_POISON    4 /* Guest is using page poisoning */
+#define VIRTIO_BALLOON_F_REPORTING    5 /* Page reporting virtqueue */

 /* Size of a PFN in the balloon interface. */
 #define VIRTIO_BALLOON_PFN_SHIFT 12

New patch
From: Alexander Duyck <alexander.h.duyck@linux.intel.com>

In an upcoming patch a feature named Free Page Reporting is about to be
added. In order to avoid any confusion we should drop the use of the word
'report' when referring to Free Page Hinting. So what this patch does is go
through and replace all instances of 'report' with 'hint' when we are
referring to free page hinting.

Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
---
hw/virtio/virtio-balloon.c | 74 ++++++++++++++++++------------------
include/hw/virtio/virtio-balloon.h | 20 +++++-----
2 files changed, 47 insertions(+), 47 deletions(-)

15
diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
16
index XXXXXXX..XXXXXXX 100644
17
--- a/hw/virtio/virtio-balloon.c
18
+++ b/hw/virtio/virtio-balloon.c
19
@@ -XXX,XX +XXX,XX @@ static bool get_free_page_hints(VirtIOBalloon *dev)
20
ret = false;
21
goto out;
22
}
23
- if (id == dev->free_page_report_cmd_id) {
24
- dev->free_page_report_status = FREE_PAGE_REPORT_S_START;
25
+ if (id == dev->free_page_hint_cmd_id) {
26
+ dev->free_page_hint_status = FREE_PAGE_HINT_S_START;
27
} else {
28
/*
29
* Stop the optimization only when it has started. This
30
* avoids a stale stop sign for the previous command.
31
*/
32
- if (dev->free_page_report_status == FREE_PAGE_REPORT_S_START) {
33
- dev->free_page_report_status = FREE_PAGE_REPORT_S_STOP;
34
+ if (dev->free_page_hint_status == FREE_PAGE_HINT_S_START) {
35
+ dev->free_page_hint_status = FREE_PAGE_HINT_S_STOP;
36
}
37
}
38
}
39
40
if (elem->in_num) {
41
- if (dev->free_page_report_status == FREE_PAGE_REPORT_S_START) {
42
+ if (dev->free_page_hint_status == FREE_PAGE_HINT_S_START) {
43
qemu_guest_free_page_hint(elem->in_sg[0].iov_base,
44
elem->in_sg[0].iov_len);
45
}
46
@@ -XXX,XX +XXX,XX @@ static void virtio_ballloon_get_free_page_hints(void *opaque)
47
qemu_mutex_unlock(&dev->free_page_lock);
48
virtio_notify(vdev, vq);
49
/*
50
- * Start to poll the vq once the reporting started. Otherwise, continue
51
+ * Start to poll the vq once the hinting started. Otherwise, continue
52
* only when there are entries on the vq, which need to be given back.
53
*/
54
} while (continue_to_get_hints ||
55
- dev->free_page_report_status == FREE_PAGE_REPORT_S_START);
56
+ dev->free_page_hint_status == FREE_PAGE_HINT_S_START);
57
virtio_queue_set_notification(vq, 1);
58
}
59
60
@@ -XXX,XX +XXX,XX @@ static void virtio_balloon_free_page_start(VirtIOBalloon *s)
61
return;
62
}
63
64
- if (s->free_page_report_cmd_id == UINT_MAX) {
65
- s->free_page_report_cmd_id =
66
- VIRTIO_BALLOON_FREE_PAGE_REPORT_CMD_ID_MIN;
67
+ if (s->free_page_hint_cmd_id == UINT_MAX) {
68
+ s->free_page_hint_cmd_id =
69
+ VIRTIO_BALLOON_FREE_PAGE_HINT_CMD_ID_MIN;
70
} else {
71
- s->free_page_report_cmd_id++;
72
+ s->free_page_hint_cmd_id++;
73
}
74
75
- s->free_page_report_status = FREE_PAGE_REPORT_S_REQUESTED;
76
+ s->free_page_hint_status = FREE_PAGE_HINT_S_REQUESTED;
77
virtio_notify_config(vdev);
78
}
79
80
@@ -XXX,XX +XXX,XX @@ static void virtio_balloon_free_page_stop(VirtIOBalloon *s)
81
{
82
VirtIODevice *vdev = VIRTIO_DEVICE(s);
83
84
- if (s->free_page_report_status != FREE_PAGE_REPORT_S_STOP) {
85
+ if (s->free_page_hint_status != FREE_PAGE_HINT_S_STOP) {
86
/*
87
* The lock also guarantees us that the
88
* virtio_ballloon_get_free_page_hints exits after the
89
- * free_page_report_status is set to S_STOP.
90
+ * free_page_hint_status is set to S_STOP.
91
*/
92
qemu_mutex_lock(&s->free_page_lock);
93
/*
94
* The guest hasn't done the reporting, so host sends a notification
95
* to the guest to actively stop the reporting.
96
*/
97
- s->free_page_report_status = FREE_PAGE_REPORT_S_STOP;
98
+ s->free_page_hint_status = FREE_PAGE_HINT_S_STOP;
99
qemu_mutex_unlock(&s->free_page_lock);
100
virtio_notify_config(vdev);
101
}
102
@@ -XXX,XX +XXX,XX @@ static void virtio_balloon_free_page_done(VirtIOBalloon *s)
103
{
104
VirtIODevice *vdev = VIRTIO_DEVICE(s);
105
106
- s->free_page_report_status = FREE_PAGE_REPORT_S_DONE;
107
+ s->free_page_hint_status = FREE_PAGE_HINT_S_DONE;
108
virtio_notify_config(vdev);
109
}
110
111
static int
112
-virtio_balloon_free_page_report_notify(NotifierWithReturn *n, void *data)
113
+virtio_balloon_free_page_hint_notify(NotifierWithReturn *n, void *data)
114
{
115
VirtIOBalloon *dev = container_of(n, VirtIOBalloon,
116
- free_page_report_notify);
117
+ free_page_hint_notify);
118
VirtIODevice *vdev = VIRTIO_DEVICE(dev);
119
PrecopyNotifyData *pnd = data;
120
121
@@ -XXX,XX +XXX,XX @@ static size_t virtio_balloon_config_size(VirtIOBalloon *s)
122
if (virtio_has_feature(features, VIRTIO_BALLOON_F_FREE_PAGE_HINT)) {
123
return offsetof(struct virtio_balloon_config, poison_val);
124
}
125
- return offsetof(struct virtio_balloon_config, free_page_report_cmd_id);
126
+ return offsetof(struct virtio_balloon_config, free_page_hint_cmd_id);
127
}
128
129
static void virtio_balloon_get_config(VirtIODevice *vdev, uint8_t *config_data)
130
@@ -XXX,XX +XXX,XX @@ static void virtio_balloon_get_config(VirtIODevice *vdev, uint8_t *config_data)
131
config.num_pages = cpu_to_le32(dev->num_pages);
132
config.actual = cpu_to_le32(dev->actual);
133
134
- if (dev->free_page_report_status == FREE_PAGE_REPORT_S_REQUESTED) {
135
- config.free_page_report_cmd_id =
136
- cpu_to_le32(dev->free_page_report_cmd_id);
137
- } else if (dev->free_page_report_status == FREE_PAGE_REPORT_S_STOP) {
138
- config.free_page_report_cmd_id =
139
+ if (dev->free_page_hint_status == FREE_PAGE_HINT_S_REQUESTED) {
140
+ config.free_page_hint_cmd_id =
141
+ cpu_to_le32(dev->free_page_hint_cmd_id);
142
+ } else if (dev->free_page_hint_status == FREE_PAGE_HINT_S_STOP) {
143
+ config.free_page_hint_cmd_id =
144
cpu_to_le32(VIRTIO_BALLOON_CMD_ID_STOP);
145
- } else if (dev->free_page_report_status == FREE_PAGE_REPORT_S_DONE) {
146
- config.free_page_report_cmd_id =
147
+ } else if (dev->free_page_hint_status == FREE_PAGE_HINT_S_DONE) {
148
+ config.free_page_hint_cmd_id =
149
cpu_to_le32(VIRTIO_BALLOON_CMD_ID_DONE);
150
}
151
152
@@ -XXX,XX +XXX,XX @@ static int virtio_balloon_post_load_device(void *opaque, int version_id)
153
return 0;
154
}
155
156
-static const VMStateDescription vmstate_virtio_balloon_free_page_report = {
157
+static const VMStateDescription vmstate_virtio_balloon_free_page_hint = {
158
.name = "virtio-balloon-device/free-page-report",
159
.version_id = 1,
160
.minimum_version_id = 1,
161
.needed = virtio_balloon_free_page_support,
162
.fields = (VMStateField[]) {
163
- VMSTATE_UINT32(free_page_report_cmd_id, VirtIOBalloon),
164
- VMSTATE_UINT32(free_page_report_status, VirtIOBalloon),
165
+ VMSTATE_UINT32(free_page_hint_cmd_id, VirtIOBalloon),
166
+ VMSTATE_UINT32(free_page_hint_status, VirtIOBalloon),
167
VMSTATE_END_OF_LIST()
168
}
169
};
170
@@ -XXX,XX +XXX,XX @@ static const VMStateDescription vmstate_virtio_balloon_device = {
171
VMSTATE_END_OF_LIST()
172
},
173
.subsections = (const VMStateDescription * []) {
174
- &vmstate_virtio_balloon_free_page_report,
175
+ &vmstate_virtio_balloon_free_page_hint,
176
NULL
177
}
178
};
179
@@ -XXX,XX +XXX,XX @@ static void virtio_balloon_device_realize(DeviceState *dev, Error **errp)
180
VIRTIO_BALLOON_F_FREE_PAGE_HINT)) {
181
s->free_page_vq = virtio_add_queue(vdev, VIRTQUEUE_MAX_SIZE,
182
virtio_balloon_handle_free_page_vq);
183
- s->free_page_report_status = FREE_PAGE_REPORT_S_STOP;
184
- s->free_page_report_cmd_id =
185
- VIRTIO_BALLOON_FREE_PAGE_REPORT_CMD_ID_MIN;
186
- s->free_page_report_notify.notify =
187
- virtio_balloon_free_page_report_notify;
188
- precopy_add_notifier(&s->free_page_report_notify);
189
+ s->free_page_hint_status = FREE_PAGE_HINT_S_STOP;
190
+ s->free_page_hint_cmd_id =
191
+ VIRTIO_BALLOON_FREE_PAGE_HINT_CMD_ID_MIN;
192
+ s->free_page_hint_notify.notify =
193
+ virtio_balloon_free_page_hint_notify;
194
+ precopy_add_notifier(&s->free_page_hint_notify);
195
if (s->iothread) {
196
object_ref(OBJECT(s->iothread));
197
s->free_page_bh = aio_bh_new(iothread_get_aio_context(s->iothread),
198
@@ -XXX,XX +XXX,XX @@ static void virtio_balloon_device_unrealize(DeviceState *dev, Error **errp)
199
if (virtio_balloon_free_page_support(s)) {
200
qemu_bh_delete(s->free_page_bh);
201
virtio_balloon_free_page_stop(s);
202
- precopy_remove_notifier(&s->free_page_report_notify);
203
+ precopy_remove_notifier(&s->free_page_hint_notify);
204
}
205
balloon_stats_destroy_timer(s);
206
qemu_remove_balloon_handler(s);
207
diff --git a/include/hw/virtio/virtio-balloon.h b/include/hw/virtio/virtio-balloon.h
208
index XXXXXXX..XXXXXXX 100644
209
--- a/include/hw/virtio/virtio-balloon.h
210
+++ b/include/hw/virtio/virtio-balloon.h
211
@@ -XXX,XX +XXX,XX @@
212
#define VIRTIO_BALLOON(obj) \
213
OBJECT_CHECK(VirtIOBalloon, (obj), TYPE_VIRTIO_BALLOON)
214
215
-#define VIRTIO_BALLOON_FREE_PAGE_REPORT_CMD_ID_MIN 0x80000000
216
+#define VIRTIO_BALLOON_FREE_PAGE_HINT_CMD_ID_MIN 0x80000000
217
218
typedef struct virtio_balloon_stat VirtIOBalloonStat;
219
220
@@ -XXX,XX +XXX,XX @@ typedef struct virtio_balloon_stat_modern {
221
uint64_t val;
222
} VirtIOBalloonStatModern;
223
224
-enum virtio_balloon_free_page_report_status {
225
- FREE_PAGE_REPORT_S_STOP = 0,
226
- FREE_PAGE_REPORT_S_REQUESTED = 1,
227
- FREE_PAGE_REPORT_S_START = 2,
228
- FREE_PAGE_REPORT_S_DONE = 3,
229
+enum virtio_balloon_free_page_hint_status {
230
+ FREE_PAGE_HINT_S_STOP = 0,
231
+ FREE_PAGE_HINT_S_REQUESTED = 1,
232
+ FREE_PAGE_HINT_S_START = 2,
233
+ FREE_PAGE_HINT_S_DONE = 3,
234
};
235
236
typedef struct VirtIOBalloon {
237
VirtIODevice parent_obj;
238
VirtQueue *ivq, *dvq, *svq, *free_page_vq;
239
- uint32_t free_page_report_status;
240
+ uint32_t free_page_hint_status;
241
uint32_t num_pages;
242
uint32_t actual;
243
- uint32_t free_page_report_cmd_id;
244
+ uint32_t free_page_hint_cmd_id;
245
uint64_t stats[VIRTIO_BALLOON_S_NR];
246
VirtQueueElement *stats_vq_elem;
247
size_t stats_vq_offset;
248
@@ -XXX,XX +XXX,XX @@ typedef struct VirtIOBalloon {
249
QEMUBH *free_page_bh;
250
/*
251
* Lock to synchronize threads to access the free page reporting related
252
- * fields (e.g. free_page_report_status).
253
+ * fields (e.g. free_page_hint_status).
254
*/
255
QemuMutex free_page_lock;
256
QemuCond free_page_cond;
257
@@ -XXX,XX +XXX,XX @@ typedef struct VirtIOBalloon {
258
* stopped.
259
*/
260
bool block_iothread;
261
- NotifierWithReturn free_page_report_notify;
262
+ NotifierWithReturn free_page_hint_notify;
263
int64_t stats_last_update;
264
int64_t stats_poll_interval;
265
uint32_t host_features;
266
267
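For reference, the command id handling that virtio_balloon_free_page_start keeps under its new name can be sketched on its own: the id counts upward and wraps back to the minimum rather than to zero. This is a minimal sketch, not the QEMU code. `DEMO_CMD_ID_MIN` and `demo_next_cmd_id` are hypothetical names, and the sketch uses `UINT32_MAX` where the patch compares against `UINT_MAX`, which is equivalent only if `unsigned int` is 32 bits wide.

```c
#include <assert.h>
#include <stdint.h>

/* Same value as VIRTIO_BALLOON_FREE_PAGE_HINT_CMD_ID_MIN in the patch. */
#define DEMO_CMD_ID_MIN 0x80000000u

/*
 * Hypothetical helper: return the command id that follows 'cur'.
 * Ids stay in [DEMO_CMD_ID_MIN, UINT32_MAX]; after the maximum the
 * sequence restarts at the minimum instead of overflowing to 0.
 */
static uint32_t demo_next_cmd_id(uint32_t cur)
{
    assert(cur >= DEMO_CMD_ID_MIN);

    if (cur == UINT32_MAX) {
        return DEMO_CMD_ID_MIN;
    }
    return cur + 1;
}
```

Keeping the ids in the upper half of the 32-bit range leaves the small values free for special commands such as VIRTIO_BALLOON_CMD_ID_STOP and VIRTIO_BALLOON_CMD_ID_DONE, which the get_config hunk above writes into the same field.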
1
From: Alexander Duyck <alexander.h.duyck@linux.intel.com>
1
From: Alexander Duyck <alexander.h.duyck@linux.intel.com>
2
2
3
We need to make certain to advertise support for page poison tracking if
3
We need to make certain to advertise support for page poison tracking if
4
we want to actually get data on if the guest will be poisoning pages. So
4
we want to actually get data on if the guest will be poisoning pages.
5
if free page hinting is active we should add page poisoning support and
6
let the guest disable it if it isn't using it.
7
5
8
Page poisoning will result in a page being dirtied on free. As such we
6
Add a value for tracking the poison value being used if page poisoning is
9
cannot really avoid having to copy the page at least one more time since
7
enabled. With this we can determine if we will need to skip page reporting
10
we will need to write the poison value to the destination. As such we can
8
when it is enabled in the future.
11
just ignore free page hinting if page poisoning is enabled as it will
12
actually reduce the work we have to do.
13
9
14
Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
10
Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
15
---
11
---
16
hw/virtio/virtio-balloon.c | 26 ++++++++++++++++++++++----
12
hw/virtio/virtio-balloon.c | 7 +++++++
17
include/hw/virtio/virtio-balloon.h | 1 +
13
include/hw/virtio/virtio-balloon.h | 1 +
18
2 files changed, 23 insertions(+), 4 deletions(-)
14
2 files changed, 8 insertions(+)
19
15
20
diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
16
diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
21
index XXXXXXX..XXXXXXX 100644
17
index XXXXXXX..XXXXXXX 100644
22
--- a/hw/virtio/virtio-balloon.c
18
--- a/hw/virtio/virtio-balloon.c
23
+++ b/hw/virtio/virtio-balloon.c
19
+++ b/hw/virtio/virtio-balloon.c
24
@@ -XXX,XX +XXX,XX @@ static void virtio_balloon_free_page_start(VirtIOBalloon *s)
25
return;
26
}
27
28
+ /*
29
+ * If page poisoning is enabled then we probably shouldn't bother with
30
+ * the hinting since the poisoning will dirty the page and invalidate
31
+ * the work we are doing anyway.
32
+ */
33
+ if (virtio_vdev_has_feature(vdev, VIRTIO_BALLOON_F_PAGE_POISON)) {
34
+ return;
35
+ }
36
+
37
if (s->free_page_report_cmd_id == UINT_MAX) {
38
s->free_page_report_cmd_id =
39
VIRTIO_BALLOON_FREE_PAGE_REPORT_CMD_ID_MIN;
40
@@ -XXX,XX +XXX,XX @@ static size_t virtio_balloon_config_size(VirtIOBalloon *s)
41
if (s->qemu_4_0_config_size) {
42
return sizeof(struct virtio_balloon_config);
43
}
44
- if (virtio_has_feature(features, VIRTIO_BALLOON_F_PAGE_POISON)) {
45
+ if (virtio_has_feature(features, VIRTIO_BALLOON_F_PAGE_POISON) ||
46
+ virtio_has_feature(features, VIRTIO_BALLOON_F_FREE_PAGE_HINT)) {
47
return sizeof(struct virtio_balloon_config);
48
}
49
- if (virtio_has_feature(features, VIRTIO_BALLOON_F_FREE_PAGE_HINT)) {
50
- return offsetof(struct virtio_balloon_config, poison_val);
51
- }
52
return offsetof(struct virtio_balloon_config, free_page_report_cmd_id);
53
}
54
55
@@ -XXX,XX +XXX,XX @@ static void virtio_balloon_get_config(VirtIODevice *vdev, uint8_t *config_data)
20
@@ -XXX,XX +XXX,XX @@ static void virtio_balloon_get_config(VirtIODevice *vdev, uint8_t *config_data)
56
21
57
config.num_pages = cpu_to_le32(dev->num_pages);
22
config.num_pages = cpu_to_le32(dev->num_pages);
58
config.actual = cpu_to_le32(dev->actual);
23
config.actual = cpu_to_le32(dev->actual);
59
+ config.poison_val = cpu_to_le32(dev->poison_val);
24
+ config.poison_val = cpu_to_le32(dev->poison_val);
60
25
61
if (dev->free_page_report_status == FREE_PAGE_REPORT_S_REQUESTED) {
26
if (dev->free_page_hint_status == FREE_PAGE_HINT_S_REQUESTED) {
62
config.free_page_report_cmd_id =
27
config.free_page_hint_cmd_id =
63
@@ -XXX,XX +XXX,XX @@ static void virtio_balloon_set_config(VirtIODevice *vdev,
28
@@ -XXX,XX +XXX,XX @@ static void virtio_balloon_set_config(VirtIODevice *vdev,
64
qapi_event_send_balloon_change(vm_ram_size -
29
qapi_event_send_balloon_change(vm_ram_size -
65
((ram_addr_t) dev->actual << VIRTIO_BALLOON_PFN_SHIFT));
30
((ram_addr_t) dev->actual << VIRTIO_BALLOON_PFN_SHIFT));
66
}
31
}
67
+ dev->poison_val = virtio_vdev_has_feature(vdev,
32
+ dev->poison_val = 0;
68
+ VIRTIO_BALLOON_F_PAGE_POISON) ?
33
+ if (virtio_vdev_has_feature(vdev, VIRTIO_BALLOON_F_PAGE_POISON)) {
69
+ le32_to_cpu(config.poison_val) : 0;
34
+ dev->poison_val = le32_to_cpu(config.poison_val);
35
+ }
70
trace_virtio_balloon_set_config(dev->actual, oldactual);
36
trace_virtio_balloon_set_config(dev->actual, oldactual);
71
}
37
}
72
38
73
@@ -XXX,XX +XXX,XX @@ static uint64_t virtio_balloon_get_features(VirtIODevice *vdev, uint64_t f,
74
VirtIOBalloon *dev = VIRTIO_BALLOON(vdev);
75
f |= dev->host_features;
76
virtio_add_feature(&f, VIRTIO_BALLOON_F_STATS_VQ);
77
+ if (virtio_has_feature(f, VIRTIO_BALLOON_F_FREE_PAGE_HINT)) {
78
+ virtio_add_feature(&f, VIRTIO_BALLOON_F_PAGE_POISON);
79
+ }
80
81
return f;
82
}
83
@@ -XXX,XX +XXX,XX @@ static void virtio_balloon_device_reset(VirtIODevice *vdev)
39
@@ -XXX,XX +XXX,XX @@ static void virtio_balloon_device_reset(VirtIODevice *vdev)
84
g_free(s->stats_vq_elem);
40
g_free(s->stats_vq_elem);
85
s->stats_vq_elem = NULL;
41
s->stats_vq_elem = NULL;
86
}
42
}
87
+
43
+
88
+ s->poison_val = 0;
44
+ s->poison_val = 0;
89
}
45
}
90
46
91
static void virtio_balloon_set_status(VirtIODevice *vdev, uint8_t status)
47
static void virtio_balloon_set_status(VirtIODevice *vdev, uint8_t status)
92
@@ -XXX,XX +XXX,XX @@ static Property virtio_balloon_properties[] = {
93
VIRTIO_BALLOON_F_DEFLATE_ON_OOM, false),
94
DEFINE_PROP_BIT("free-page-hint", VirtIOBalloon, host_features,
95
VIRTIO_BALLOON_F_FREE_PAGE_HINT, false),
96
+ DEFINE_PROP_BIT("x-page-poison", VirtIOBalloon, host_features,
97
+ VIRTIO_BALLOON_F_PAGE_POISON, false),
98
/* QEMU 4.0 accidentally changed the config size even when free-page-hint
99
* is disabled, resulting in QEMU 3.1 migration incompatibility. This
100
* property retains this quirk for QEMU 4.1 machine types.
101
diff --git a/include/hw/virtio/virtio-balloon.h b/include/hw/virtio/virtio-balloon.h
48
diff --git a/include/hw/virtio/virtio-balloon.h b/include/hw/virtio/virtio-balloon.h
102
index XXXXXXX..XXXXXXX 100644
49
index XXXXXXX..XXXXXXX 100644
103
--- a/include/hw/virtio/virtio-balloon.h
50
--- a/include/hw/virtio/virtio-balloon.h
104
+++ b/include/hw/virtio/virtio-balloon.h
51
+++ b/include/hw/virtio/virtio-balloon.h
105
@@ -XXX,XX +XXX,XX @@ typedef struct VirtIOBalloon {
52
@@ -XXX,XX +XXX,XX @@ typedef struct VirtIOBalloon {
...
...
diff view generated by jsdifflib
...
...
10
pages to the hypervisor, so the hypervisor can reuse them. In contrast to
10
pages to the hypervisor, so the hypervisor can reuse them. In contrast to
11
inflate/deflate that is triggered via the hypervisor explicitly.
11
inflate/deflate that is triggered via the hypervisor explicitly.
12
12
13
Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
13
Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
14
---
14
---
15
hw/virtio/virtio-balloon.c | 63 +++++++++++++++++++++++++++++++++++-
15
hw/virtio/virtio-balloon.c | 70 ++++++++++++++++++++++++++++++++++++
16
include/hw/virtio/virtio-balloon.h | 2 +
16
include/hw/virtio/virtio-balloon.h | 2 +
17
2 files changed, 62 insertions(+), 3 deletions(-)
17
2 files changed, 71 insertions(+), 1 deletion(-)
18
18
19
diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
19
diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
20
index XXXXXXX..XXXXXXX 100644
20
index XXXXXXX..XXXXXXX 100644
21
--- a/hw/virtio/virtio-balloon.c
21
--- a/hw/virtio/virtio-balloon.c
22
+++ b/hw/virtio/virtio-balloon.c
22
+++ b/hw/virtio/virtio-balloon.c
...
...
30
+ VirtQueueElement *elem;
30
+ VirtQueueElement *elem;
31
+
31
+
32
+ while ((elem = virtqueue_pop(vq, sizeof(VirtQueueElement)))) {
32
+ while ((elem = virtqueue_pop(vq, sizeof(VirtQueueElement)))) {
33
+ unsigned int i;
33
+ unsigned int i;
34
+
34
+
35
+ if (qemu_balloon_is_inhibited() || dev->poison_val) {
36
+ goto skip_element;
37
+ }
38
+
35
+ for (i = 0; i < elem->in_num; i++) {
39
+ for (i = 0; i < elem->in_num; i++) {
36
+ void *addr = elem->in_sg[i].iov_base;
40
+ void *addr = elem->in_sg[i].iov_base;
37
+ size_t size = elem->in_sg[i].iov_len;
41
+ size_t size = elem->in_sg[i].iov_len;
38
+ ram_addr_t ram_offset;
42
+ ram_addr_t ram_offset;
39
+ size_t rb_page_size;
40
+ RAMBlock *rb;
43
+ RAMBlock *rb;
41
+
42
+ if (qemu_balloon_is_inhibited() || dev->poison_val) {
43
+ continue;
44
+ }
45
+
44
+
46
+ /*
45
+ /*
47
+ * There is no need to check the memory section to see if
46
+ * There is no need to check the memory section to see if
48
+ * it is ram/readonly/romd like there is for handle_output
47
+ * it is ram/readonly/romd like there is for handle_output
49
+ * below. If the region is not meant to be written to then
48
+ * below. If the region is not meant to be written to then
...
...
58
+ if (!rb) {
57
+ if (!rb) {
59
+ trace_virtio_balloon_bad_addr(elem->in_addr[i]);
58
+ trace_virtio_balloon_bad_addr(elem->in_addr[i]);
60
+ continue;
59
+ continue;
61
+ }
60
+ }
62
+
61
+
63
+ /* For now we will simply ignore unaligned memory regions */
62
+ /*
64
+ rb_page_size = qemu_ram_pagesize(rb);
63
+ * For now we will simply ignore unaligned memory regions, or
65
+ if (!QEMU_IS_ALIGNED(ram_offset | size, rb_page_size)) {
64
+ * regions that overrun the end of the RAMBlock.
65
+ */
66
+ if (!QEMU_IS_ALIGNED(ram_offset | size, qemu_ram_pagesize(rb)) ||
67
+ (ram_offset + size) > qemu_ram_get_used_length(rb)) {
66
+ continue;
68
+ continue;
67
+ }
69
+ }
68
+
70
+
69
+ ram_block_discard_range(rb, ram_offset, size);
71
+ ram_block_discard_range(rb, ram_offset, size);
70
+ }
72
+ }
71
+
73
+
74
+skip_element:
72
+ virtqueue_push(vq, elem, 0);
75
+ virtqueue_push(vq, elem, 0);
73
+ virtio_notify(vdev, vq);
76
+ virtio_notify(vdev, vq);
74
+ g_free(elem);
77
+ g_free(elem);
75
+ }
78
+ }
76
+}
79
+}
77
+
80
+
78
static void virtio_balloon_handle_output(VirtIODevice *vdev, VirtQueue *vq)
81
static void virtio_balloon_handle_output(VirtIODevice *vdev, VirtQueue *vq)
79
{
82
{
80
VirtIOBalloon *s = VIRTIO_BALLOON(vdev);
83
VirtIOBalloon *s = VIRTIO_BALLOON(vdev);
81
@@ -XXX,XX +XXX,XX @@ static size_t virtio_balloon_config_size(VirtIOBalloon *s)
84
@@ -XXX,XX +XXX,XX @@ static void virtio_balloon_device_realize(DeviceState *dev, Error **errp)
82
return sizeof(struct virtio_balloon_config);
85
VirtIOBalloon *s = VIRTIO_BALLOON(dev);
83
}
86
int ret;
84
if (virtio_has_feature(features, VIRTIO_BALLOON_F_PAGE_POISON) ||
87
85
- virtio_has_feature(features, VIRTIO_BALLOON_F_FREE_PAGE_HINT)) {
88
+ /*
86
+ virtio_has_feature(features, VIRTIO_BALLOON_F_FREE_PAGE_HINT) ||
89
+ * Page reporting is dependant on page poison to make sure we can
87
+ virtio_has_feature(features, VIRTIO_BALLOON_F_REPORTING)) {
90
+ * report a page without changing the state of the internal data.
88
return sizeof(struct virtio_balloon_config);
91
+ * We need to set the flag before we call virtio_init as it will
89
}
92
+ * affect the config size of the vdev.
90
return offsetof(struct virtio_balloon_config, free_page_report_cmd_id);
93
+ */
91
@@ -XXX,XX +XXX,XX @@ static uint64_t virtio_balloon_get_features(VirtIODevice *vdev, uint64_t f,
94
+ if (virtio_has_feature(s->host_features, VIRTIO_BALLOON_F_REPORTING)) {
92
VirtIOBalloon *dev = VIRTIO_BALLOON(vdev);
95
+ s->host_features |= 1 << VIRTIO_BALLOON_F_PAGE_POISON;
93
f |= dev->host_features;
96
+ }
94
virtio_add_feature(&f, VIRTIO_BALLOON_F_STATS_VQ);
97
+
95
- if (virtio_has_feature(f, VIRTIO_BALLOON_F_FREE_PAGE_HINT)) {
98
virtio_init(vdev, "virtio-balloon", VIRTIO_ID_BALLOON,
96
+ if (virtio_has_feature(f, VIRTIO_BALLOON_F_FREE_PAGE_HINT) ||
99
virtio_balloon_config_size(s));
97
+ virtio_has_feature(f, VIRTIO_BALLOON_F_REPORTING)) {
98
virtio_add_feature(&f, VIRTIO_BALLOON_F_PAGE_POISON);
99
}
100
100
101
@@ -XXX,XX +XXX,XX @@ static void virtio_balloon_device_realize(DeviceState *dev, Error **errp)
101
@@ -XXX,XX +XXX,XX @@ static void virtio_balloon_device_realize(DeviceState *dev, Error **errp)
102
s->dvq = virtio_add_queue(vdev, 128, virtio_balloon_handle_output);
102
s->dvq = virtio_add_queue(vdev, 128, virtio_balloon_handle_output);
103
s->svq = virtio_add_queue(vdev, 128, virtio_balloon_receive_stats);
103
s->svq = virtio_add_queue(vdev, 128, virtio_balloon_receive_stats);
104
104
...
...
108
+
108
+
109
if (virtio_has_feature(s->host_features,
109
if (virtio_has_feature(s->host_features,
110
VIRTIO_BALLOON_F_FREE_PAGE_HINT)) {
110
VIRTIO_BALLOON_F_FREE_PAGE_HINT)) {
111
s->free_page_vq = virtio_add_queue(vdev, VIRTQUEUE_MAX_SIZE,
111
s->free_page_vq = virtio_add_queue(vdev, VIRTQUEUE_MAX_SIZE,
112
@@ -XXX,XX +XXX,XX @@ static Property virtio_balloon_properties[] = {
112
@@ -XXX,XX +XXX,XX @@ static Property virtio_balloon_properties[] = {
113
*/
113
VIRTIO_BALLOON_F_DEFLATE_ON_OOM, false),
114
DEFINE_PROP_BOOL("qemu-4-0-config-size", VirtIOBalloon,
114
DEFINE_PROP_BIT("free-page-hint", VirtIOBalloon, host_features,
115
qemu_4_0_config_size, false),
115
VIRTIO_BALLOON_F_FREE_PAGE_HINT, false),
116
+ DEFINE_PROP_BIT("free-page-reporting", VirtIOBalloon, host_features,
116
+ DEFINE_PROP_BIT("free-page-reporting", VirtIOBalloon, host_features,
117
+ VIRTIO_BALLOON_F_REPORTING, true),
117
+ VIRTIO_BALLOON_F_REPORTING, true),
118
DEFINE_PROP_LINK("iothread", VirtIOBalloon, iothread, TYPE_IOTHREAD,
118
/* QEMU 4.0 accidentally changed the config size even when free-page-hint
119
IOThread *),
119
* is disabled, resulting in QEMU 3.1 migration incompatibility. This
120
DEFINE_PROP_END_OF_LIST(),
120
* property retains this quirk for QEMU 4.1 machine types.
121
diff --git a/include/hw/virtio/virtio-balloon.h b/include/hw/virtio/virtio-balloon.h
121
diff --git a/include/hw/virtio/virtio-balloon.h b/include/hw/virtio/virtio-balloon.h
122
index XXXXXXX..XXXXXXX 100644
122
index XXXXXXX..XXXXXXX 100644
123
--- a/include/hw/virtio/virtio-balloon.h
123
--- a/include/hw/virtio/virtio-balloon.h
124
+++ b/include/hw/virtio/virtio-balloon.h
124
+++ b/include/hw/virtio/virtio-balloon.h
125
@@ -XXX,XX +XXX,XX @@ enum virtio_balloon_free_page_report_status {
125
@@ -XXX,XX +XXX,XX @@ enum virtio_balloon_free_page_hint_status {
126
126
127
typedef struct VirtIOBalloon {
127
typedef struct VirtIOBalloon {
128
VirtIODevice parent_obj;
128
VirtIODevice parent_obj;
129
- VirtQueue *ivq, *dvq, *svq, *free_page_vq;
129
- VirtQueue *ivq, *dvq, *svq, *free_page_vq;
130
+ VirtQueue *ivq, *dvq, *svq, *free_page_vq, *rvq;
130
+ VirtQueue *ivq, *dvq, *svq, *free_page_vq, *rvq;
131
uint32_t free_page_report_status;
131
uint32_t free_page_hint_status;
132
uint32_t num_pages;
132
uint32_t num_pages;
133
uint32_t actual;
133
uint32_t actual;
134
134
135
135
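The range check added to the reporting handler (alignment of both offset and size, plus the new bound against the used length of the RAMBlock) can be illustrated in isolation. The sketch below is an assumption-laden stand-in, not the QEMU implementation: `demo_range_ok` is a hypothetical name, plain integers replace `RAMBlock`, `qemu_ram_pagesize()` and `qemu_ram_get_used_length()`, and the page size is assumed to be a power of two so that a mask can be used in place of `QEMU_IS_ALIGNED`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the check done before discarding a range.
 * Returns true only if the range may be discarded.
 */
static bool demo_range_ok(uint64_t offset, uint64_t size,
                          uint64_t page_size, uint64_t used_length)
{
    /* The mask trick below is only valid for power-of-two page sizes. */
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);

    /*
     * OR-ing offset and size lets one test cover both: the result has
     * a low bit set if either value is not a multiple of page_size.
     */
    if (((offset | size) & (page_size - 1)) != 0) {
        return false;
    }

    /*
     * Reject ranges that run past the end of the block. Like the patch,
     * this does not guard against offset + size wrapping around.
     */
    if (offset + size > used_length) {
        return false;
    }

    return true;
}
```

With a 4 KiB page size and a 16 KiB block, a range of 8 KiB at offset 4 KiB passes, while a 2 KiB range or a range ending at 20 KiB is skipped, which matches the "simply ignore" behaviour described in the comment of the patch.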