The following changes since commit 00b1faea41d283e931256aa78aa975a369ec3ae6:

  Merge tag 'pull-target-arm-20230123' of https://git.linaro.org/people/pmaydell/qemu-arm into staging (2023-01-23 13:40:28 +0000)

are available in the Git repository at:

  https://gitlab.com/stefanha/qemu.git tags/block-pull-request

for you to fetch changes up to 4f01a9bb0461e8c11ee0c94d90a504cb7d580a85:

  block/blkio: Fix inclusion of required headers (2023-01-23 15:02:07 -0500)

----------------------------------------------------------------
Pull request

----------------------------------------------------------------

Chao Gao (1):
      util/aio: Defer disabling poll mode as long as possible

Peter Krempa (1):
      block/blkio: Fix inclusion of required headers

Stefan Hajnoczi (1):
      virtio-blk: simplify virtio_blk_dma_restart_cb()

 include/hw/virtio/virtio-blk.h  |  2 --
 block/blkio.c                   |  2 ++
 hw/block/dataplane/virtio-blk.c | 17 +++-------
 hw/block/virtio-blk.c           | 46 ++++++++++++++-------------------
 util/aio-posix.c                | 21 ++++++++++-----
 5 files changed, 43 insertions(+), 45 deletions(-)

--
2.39.0
From: Chao Gao <chao.gao@intel.com>

When we measure FIO read performance (cache=writethrough, bs=4k,
iodepth=64) in VMs, ~80K/s notifications (e.g., EPT_MISCONFIG) are observed
from guest to qemu.

It turns out those frequent notifications are caused by interference from
worker threads. Worker threads queue bottom halves after completing IO
requests. Pending bottom halves may lead to either aio_compute_timeout()
zeroing the timeout passed to try_poll_mode(), or run_poll_handlers()
returning no progress after noticing pending aio_notify() events. Both
cause run_poll_handlers() to call poll_set_started(false) to disable poll
mode. However, in both cases, as the timeout is already zero, the event
loop (i.e., aio_poll()) just processes bottom halves and then starts the
next event loop iteration. So, disabling poll mode has no value but leads
to unnecessary notifications from the guest.

To minimize unnecessary notifications from the guest, defer disabling poll
mode until the event loop is about to be blocked.

With this patch applied, FIO seq-read performance (bs=4k, iodepth=64,
cache=writethrough) in VMs increases from 330K to 413K IOPS.

Suggested-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Chao Gao <chao.gao@intel.com>
Message-id: 20220710120849.63086-1-chao.gao@intel.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 util/aio-posix.c | 21 +++++++++++++++------
 1 file changed, 15 insertions(+), 6 deletions(-)

diff --git a/util/aio-posix.c b/util/aio-posix.c
index XXXXXXX..XXXXXXX 100644
--- a/util/aio-posix.c
+++ b/util/aio-posix.c
@@ -XXX,XX +XXX,XX @@ static bool try_poll_mode(AioContext *ctx, AioHandlerList *ready_list,
 
     max_ns = qemu_soonest_timeout(*timeout, ctx->poll_ns);
     if (max_ns && !ctx->fdmon_ops->need_wait(ctx)) {
+        /*
+         * Enable poll mode. It pairs with the poll_set_started() in
+         * aio_poll() which disables poll mode.
+         */
         poll_set_started(ctx, ready_list, true);
 
         if (run_poll_handlers(ctx, ready_list, max_ns, timeout)) {
             return true;
         }
     }
-
-    if (poll_set_started(ctx, ready_list, false)) {
-        *timeout = 0;
-        return true;
-    }
-
     return false;
 }
 
@@ -XXX,XX +XXX,XX @@ bool aio_poll(AioContext *ctx, bool blocking)
      * system call---a single round of run_poll_handlers_once suffices.
      */
     if (timeout || ctx->fdmon_ops->need_wait(ctx)) {
+        /*
+         * Disable poll mode. poll mode should be disabled before the call
+         * of ctx->fdmon_ops->wait() so that guest's notification can wake
+         * up IO threads when some work becomes pending. It is essential to
+         * avoid hangs or unnecessary latency.
+         */
+        if (poll_set_started(ctx, &ready_list, false)) {
+            timeout = 0;
+            progress = true;
+        }
+
         ctx->fdmon_ops->wait(ctx, &ready_list, timeout);
     }
 
--
2.39.0
virtio_blk_dma_restart_cb() is tricky because the BH must deal with
virtio_blk_data_plane_start()/virtio_blk_data_plane_stop() being called.

There are two issues with the code:

1. virtio_blk_realize() should use qdev_add_vm_change_state_handler()
   instead of qemu_add_vm_change_state_handler(). This ensures the
   ordering with virtio_init()'s vm change state handler that calls
   virtio_blk_data_plane_start()/virtio_blk_data_plane_stop() is
   well-defined. Then blk's AioContext is guaranteed to be up-to-date in
   virtio_blk_dma_restart_cb() and it's no longer necessary to have a
   special case for virtio_blk_data_plane_start().

2. Only blk_drain() waits for virtio_blk_dma_restart_cb()'s
   blk_inc_in_flight() to be decremented. The bdrv_drain() family of
   functions do not wait for BlockBackend's in_flight counter to reach
   zero. virtio_blk_data_plane_stop() relies on blk_set_aio_context()'s
   implicit drain, but that's a bdrv_drain() and not a blk_drain().
   Note that virtio_blk_reset() already correctly relies on blk_drain().
   If virtio_blk_data_plane_stop() switches to blk_drain() then we can
   properly wait for pending virtio_blk_dma_restart_bh() calls.

Once these issues are taken care of the code becomes simpler. This
change is in preparation for multiple IOThreads in virtio-blk where we
need to clean up the multi-threading behavior.

I ran the reproducer from commit 49b44549ace7 ("virtio-blk: On restart,
process queued requests in the proper context") to check that there is
no regression.

Cc: Sergio Lopez <slp@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Message-id: 20221102182337.252202-1-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 include/hw/virtio/virtio-blk.h  |  2 --
 hw/block/dataplane/virtio-blk.c | 17 +++-------
 hw/block/virtio-blk.c           | 46 ++++++++++++++-------------------
 3 files changed, 26 insertions(+), 39 deletions(-)

diff --git a/include/hw/virtio/virtio-blk.h b/include/hw/virtio/virtio-blk.h
index XXXXXXX..XXXXXXX 100644
--- a/include/hw/virtio/virtio-blk.h
+++ b/include/hw/virtio/virtio-blk.h
@@ -XXX,XX +XXX,XX @@ struct VirtIOBlock {
     VirtIODevice parent_obj;
     BlockBackend *blk;
     void *rq;
-    QEMUBH *bh;
     VirtIOBlkConf conf;
     unsigned short sector_mask;
     bool original_wce;
@@ -XXX,XX +XXX,XX @@ typedef struct MultiReqBuffer {
 } MultiReqBuffer;
 
 void virtio_blk_handle_vq(VirtIOBlock *s, VirtQueue *vq);
-void virtio_blk_process_queued_requests(VirtIOBlock *s, bool is_bh);
 
 #endif
diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -XXX,XX +XXX,XX @@ int virtio_blk_data_plane_start(VirtIODevice *vdev)
         goto fail_aio_context;
     }
 
-    /* Process queued requests before the ones in vring */
-    virtio_blk_process_queued_requests(vblk, false);
-
     /* Kick right away to begin processing requests already in vring */
     for (i = 0; i < nvqs; i++) {
         VirtQueue *vq = virtio_get_queue(s->vdev, i);
@@ -XXX,XX +XXX,XX @@ int virtio_blk_data_plane_start(VirtIODevice *vdev)
   fail_host_notifiers:
     k->set_guest_notifiers(qbus->parent, nvqs, false);
   fail_guest_notifiers:
-    /*
-     * If we failed to set up the guest notifiers queued requests will be
-     * processed on the main context.
-     */
-    virtio_blk_process_queued_requests(vblk, false);
     vblk->dataplane_disabled = true;
     s->starting = false;
     vblk->dataplane_started = true;
@@ -XXX,XX +XXX,XX @@ void virtio_blk_data_plane_stop(VirtIODevice *vdev)
     aio_context_acquire(s->ctx);
     aio_wait_bh_oneshot(s->ctx, virtio_blk_data_plane_stop_bh, s);
 
-    /* Drain and try to switch bs back to the QEMU main loop. If other users
-     * keep the BlockBackend in the iothread, that's ok */
+    /* Wait for virtio_blk_dma_restart_bh() and in flight I/O to complete */
+    blk_drain(s->conf->conf.blk);
+
+    /*
+     * Try to switch bs back to the QEMU main loop. If other users keep the
+     * BlockBackend in the iothread, that's ok
+     */
     blk_set_aio_context(s->conf->conf.blk, qemu_get_aio_context(), NULL);
 
     aio_context_release(s->ctx);
diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -XXX,XX +XXX,XX @@ static void virtio_blk_handle_output(VirtIODevice *vdev, VirtQueue *vq)
     virtio_blk_handle_vq(s, vq);
 }
 
-void virtio_blk_process_queued_requests(VirtIOBlock *s, bool is_bh)
+static void virtio_blk_dma_restart_bh(void *opaque)
 {
+    VirtIOBlock *s = opaque;
+
     VirtIOBlockReq *req = s->rq;
     MultiReqBuffer mrb = {};
 
@@ -XXX,XX +XXX,XX @@ void virtio_blk_process_queued_requests(VirtIOBlock *s, bool is_bh)
     if (mrb.num_reqs) {
         virtio_blk_submit_multireq(s, &mrb);
     }
-    if (is_bh) {
-        blk_dec_in_flight(s->conf.conf.blk);
-    }
+
+    /* Paired with inc in virtio_blk_dma_restart_cb() */
+    blk_dec_in_flight(s->conf.conf.blk);
+
     aio_context_release(blk_get_aio_context(s->conf.conf.blk));
 }
 
-static void virtio_blk_dma_restart_bh(void *opaque)
-{
-    VirtIOBlock *s = opaque;
-
-    qemu_bh_delete(s->bh);
-    s->bh = NULL;
-
-    virtio_blk_process_queued_requests(s, true);
-}
-
 static void virtio_blk_dma_restart_cb(void *opaque, bool running,
                                       RunState state)
 {
     VirtIOBlock *s = opaque;
-    BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(s)));
-    VirtioBusState *bus = VIRTIO_BUS(qbus);
 
     if (!running) {
         return;
     }
 
-    /*
-     * If ioeventfd is enabled, don't schedule the BH here as queued
-     * requests will be processed while starting the data plane.
-     */
-    if (!s->bh && !virtio_bus_ioeventfd_enabled(bus)) {
-        s->bh = aio_bh_new(blk_get_aio_context(s->conf.conf.blk),
-                           virtio_blk_dma_restart_bh, s);
-        blk_inc_in_flight(s->conf.conf.blk);
-        qemu_bh_schedule(s->bh);
-    }
+    /* Paired with dec in virtio_blk_dma_restart_bh() */
+    blk_inc_in_flight(s->conf.conf.blk);
+
+    aio_bh_schedule_oneshot(blk_get_aio_context(s->conf.conf.blk),
+                            virtio_blk_dma_restart_bh, s);
 }
 
 static void virtio_blk_reset(VirtIODevice *vdev)
@@ -XXX,XX +XXX,XX @@ static void virtio_blk_device_realize(DeviceState *dev, Error **errp)
         return;
     }
 
-    s->change = qemu_add_vm_change_state_handler(virtio_blk_dma_restart_cb, s);
+    /*
+     * This must be after virtio_init() so virtio_blk_dma_restart_cb() gets
+     * called after ->start_ioeventfd() has already set blk's AioContext.
+     */
+    s->change =
+        qdev_add_vm_change_state_handler(dev, virtio_blk_dma_restart_cb, s);
+
     blk_ram_registrar_init(&s->blk_ram_registrar, s->blk);
     blk_set_dev_ops(s->blk, &virtio_block_ops, s);
--
2.39.0
From: Peter Krempa <pkrempa@redhat.com>

After recent header file inclusion rework the build fails when the blkio
module is enabled:

../block/blkio.c: In function ‘blkio_detach_aio_context’:
../block/blkio.c:321:24: error: implicit declaration of function ‘bdrv_get_aio_context’; did you mean ‘qemu_get_aio_context’? [-Werror=implicit-function-declaration]
  321 |     aio_set_fd_handler(bdrv_get_aio_context(bs),
      |                        ^~~~~~~~~~~~~~~~~~~~
      |                        qemu_get_aio_context
../block/blkio.c:321:24: error: nested extern declaration of ‘bdrv_get_aio_context’ [-Werror=nested-externs]
../block/blkio.c:321:24: error: passing argument 1 of ‘aio_set_fd_handler’ makes pointer from integer without a cast [-Werror=int-conversion]
  321 |     aio_set_fd_handler(bdrv_get_aio_context(bs),
      |                        ^~~~~~~~~~~~~~~~~~~~~~~~
      |                        |
      |                        int
In file included from /home/pipo/git/qemu.git/include/qemu/job.h:33,
                 from /home/pipo/git/qemu.git/include/block/blockjob.h:30,
                 from /home/pipo/git/qemu.git/include/block/block_int-global-state.h:28,
                 from /home/pipo/git/qemu.git/include/block/block_int.h:27,
                 from ../block/blkio.c:13:
/home/pipo/git/qemu.git/include/block/aio.h:476:37: note: expected ‘AioContext *’ but argument is of type ‘int’
  476 | void aio_set_fd_handler(AioContext *ctx,
      |                         ~~~~~~~~~~~~^~~
../block/blkio.c: In function ‘blkio_file_open’:
../block/blkio.c:821:34: error: passing argument 2 of ‘blkio_attach_aio_context’ makes pointer from integer without a cast [-Werror=int-conversion]
  821 |     blkio_attach_aio_context(bs, bdrv_get_aio_context(bs));
      |                                  ^~~~~~~~~~~~~~~~~~~~~~~~
      |                                  |
      |                                  int

Fix it by including 'block/block-io.h' which contains the required
declarations.

Fixes: e2c1c34f139f49ef909bb4322607fb8b39002312
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-id: 2bc956011404a1ab03342aefde0087b5b4762562.1674477350.git.pkrempa@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 block/blkio.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/block/blkio.c b/block/blkio.c
index XXXXXXX..XXXXXXX 100644
--- a/block/blkio.c
+++ b/block/blkio.c
@@ -XXX,XX +XXX,XX @@
 #include "qemu/module.h"
 #include "exec/memory.h" /* for ram_block_discard_disable() */
 
+#include "block/block-io.h"
+
 /*
  * Keep the QEMU BlockDriver names identical to the libblkio driver names.
  * Using macros instead of typing out the string literals avoids typos.
--
2.39.0