The following changes since commit ad1b4ec39caa5b3f17cbd8160283a03a3dcfe2ae:

  Merge remote-tracking branch 'remotes/kraxel/tags/input-20180515-pull-request' into staging (2018-05-15 12:50:06 +0100)

are available in the git repository at:

  git://repo.or.cz/qemu/kevin.git tags/for-upstream

for you to fetch changes up to 1fce860ea5eba1ca00a67911fc0b8a5d80009514:

  Merge remote-tracking branch 'mreitz/tags/pull-block-2018-05-15' into queue-block (2018-05-15 16:19:53 +0200)

----------------------------------------------------------------
Block layer patches:

- Switch AIO/callback based block drivers to a byte-based interface
- Block jobs: Expose error string via query-block-jobs
- Block job cleanups and fixes
- hmp: Allow using a qdev id in block_set_io_throttle
- Copy-on-read block driver
- The qcow2 default refcount cache size has been decreased
- Various bug fixes
----------------------------------------------------------------
Alberto Garcia (5):
      hmp: Allow using a qdev id in block_set_io_throttle
      Fix error message about compressed clusters with OFLAG_COPIED
      specs/qcow2: Clarify that compressed clusters have the COPIED bit reset
      qcow2: Give the refcount cache the minimum possible size by default
      docs: Document the new default sizes of the qcow2 caches

Daniel Henrique Barboza (1):
      block-backend: simplify blk_get_aio_context

Eric Blake (7):
      block: Support byte-based aio callbacks
      file-win32: Switch to byte-based callbacks
      null: Switch to byte-based read/write
      rbd: Switch to byte-based callbacks
      vxhs: Switch to byte-based callbacks
      block: Drop last of the sector-based aio callbacks
      block: Merge .bdrv_co_writev{,_flags} in drivers

John Snow (1):
      blockjob: expose error string via query

Kevin Wolf (7):
      blockjob: Fix assertion in block_job_finalize()
      blockjob: Wrappers for progress counter access
      blockjob: Move RateLimit to BlockJob
      blockjob: Implement block_job_set_speed() centrally
      blockjob: Introduce block_job_ratelimit_get_delay()
      blockjob: Add block_job_driver()
      Merge remote-tracking branch 'mreitz/tags/pull-block-2018-05-15' into queue-block

Max Reitz (17):
      iotests: Split 214 off of 122
      iotests: Add failure matching to common.qemu
      iotests: Skip 181 and 201 without userfaultfd
      block: Add COR filter driver
      block: BLK_PERM_WRITE includes ..._UNCHANGED
      block: Add BDRV_REQ_WRITE_UNCHANGED flag
      block: Set BDRV_REQ_WRITE_UNCHANGED for COR writes
      block/quorum: Support BDRV_REQ_WRITE_UNCHANGED
      block: Support BDRV_REQ_WRITE_UNCHANGED in filters
      iotests: Clean up wrap image in 197
      iotests: Copy 197 for COR filter driver
      iotests: Add test for COR across nodes
      qemu-img: Check post-truncation size
      block: Document BDRV_REQ_WRITE_UNCHANGED support
      qemu-io: Use purely string blockdev options
      qemu-img: Use only string options in img_open_opts
      iotests: Add test for -U/force-share conflicts

 qapi/block-core.json             |  11 ++-
 docs/interop/qcow2.txt           |   8 +-
 docs/qcow2-cache.txt             |  33 ++++----
 block/qcow2.h                    |   4 -
 include/block/block.h            |   9 ++-
 include/block/block_int.h        |  28 +++++--
 include/block/blockjob.h         |  32 ++++++++
 include/block/blockjob_int.h     |  11 ++-
 include/block/raw-aio.h          |   2 +-
 block/backup.c                   |  62 ++++++---------
 block/blkdebug.c                 |   9 ++-
 block/blkreplay.c                |   3 +
 block/blkverify.c                |   3 +
 block/block-backend.c            |   8 +-
 block/commit.c                   |  35 +++------
 block/copy-on-read.c             | 173 +++++++++++++++++++++++++++++++++++++++++
 block/file-win32.c               |  47 ++++++-----
 block/gluster.c                  |   4 +-
 block/io.c                       |  75 ++++++++++--------
 block/iscsi.c                    |   8 +-
 block/mirror.c                   |  44 ++++-------
 block/null.c                     |  45 +++++------
 block/parallels.c                |   4 +-
 block/qcow.c                     |   6 +-
 block/qcow2-refcount.c           |   4 +-
 block/qcow2.c                    |  31 +++++---
 block/qed.c                      |   3 +-
 block/quorum.c                   |  19 +++--
 block/raw-format.c               |   9 ++-
 block/rbd.c                      |  40 +++++-----
 block/replication.c              |   4 +-
 block/sheepdog.c                 |   4 +-
 block/ssh.c                      |   4 +-
 block/stream.c                   |  33 +++-----
 block/throttle.c                 |   6 +-
 block/vhdx.c                     |   4 +-
 block/vxhs.c                     |  43 +++++-----
 block/win32-aio.c                |   5 +-
 blockjob.c                       |  40 +++++++---
 hmp.c                            |  14 +++-
 qemu-img.c                       |  43 ++++++++--
 qemu-io.c                        |   4 +-
 block/Makefile.objs              |   2 +-
 hmp-commands.hx                  |   3 +-
 tests/qemu-iotests/122           |  47 -----------
 tests/qemu-iotests/122.out       |  33 --------
 tests/qemu-iotests/137.out       |   2 +-
 tests/qemu-iotests/153           |  17 ++++
 tests/qemu-iotests/153.out       |  16 ++++
 tests/qemu-iotests/181           |  13 ++++
 tests/qemu-iotests/197           |   1 +
 tests/qemu-iotests/201           |  13 ++++
 tests/qemu-iotests/214           |  97 +++++++++++++++++++++++
 tests/qemu-iotests/214.out       |  35 +++++++++
 tests/qemu-iotests/215           | 120 ++++++++++++++++++++++++++++
 tests/qemu-iotests/215.out       |  26 +++++++
 tests/qemu-iotests/216           | 115 +++++++++++++++++++++++++++
 tests/qemu-iotests/216.out       |  28 +++++++
 tests/qemu-iotests/common.qemu   |  58 ++++++++++++--
 tests/qemu-iotests/group         |   3 +
 60 files changed, 1174 insertions(+), 429 deletions(-)
 create mode 100644 block/copy-on-read.c
 create mode 100755 tests/qemu-iotests/214
 create mode 100644 tests/qemu-iotests/214.out
 create mode 100755 tests/qemu-iotests/215
 create mode 100644 tests/qemu-iotests/215.out
 create mode 100755 tests/qemu-iotests/216
 create mode 100644 tests/qemu-iotests/216.out

From: Max Reitz <mreitz@redhat.com>

Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20180502202051.15493-4-mreitz@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 tests/qemu-iotests/153     | 17 +++++++++++++++++
 tests/qemu-iotests/153.out | 16 ++++++++++++++++
 2 files changed, 33 insertions(+)

diff --git a/tests/qemu-iotests/153 b/tests/qemu-iotests/153
index XXXXXXX..XXXXXXX 100755
--- a/tests/qemu-iotests/153
+++ b/tests/qemu-iotests/153
@@ -XXX,XX +XXX,XX @@ _run_cmd $QEMU_IO "${TEST_IMG}" -c 'write 0 512'
 
 _cleanup_qemu
 
+echo
+echo "== Detecting -U and force-share conflicts =="
+
+echo
+echo 'No conflict:'
+$QEMU_IMG info -U --image-opts driver=null-co,force-share=on
+echo
+echo 'Conflict:'
+$QEMU_IMG info -U --image-opts driver=null-co,force-share=off
+
+echo
+echo 'No conflict:'
+$QEMU_IO -c 'open -r -U -o driver=null-co,force-share=on'
+echo
+echo 'Conflict:'
+$QEMU_IO -c 'open -r -U -o driver=null-co,force-share=off'
+
 # success, all done
 echo "*** done"
 rm -f $seq.full
diff --git a/tests/qemu-iotests/153.out b/tests/qemu-iotests/153.out
index XXXXXXX..XXXXXXX 100644
--- a/tests/qemu-iotests/153.out
+++ b/tests/qemu-iotests/153.out
@@ -XXX,XX +XXX,XX @@ Is another process using the image?
 Closing the other
 
 _qemu_io_wrapper TEST_DIR/t.qcow2 -c write 0 512
+
+== Detecting -U and force-share conflicts ==
+
+No conflict:
+image: null-co://
+file format: null-co
+virtual size: 1.0G (1073741824 bytes)
+disk size: unavailable
+
+Conflict:
+qemu-img: --force-share/-U conflicts with image options
+
+No conflict:
+
+Conflict:
+-U conflicts with image options
 *** done
--
2.13.6

From: Max Reitz <mreitz@redhat.com>

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20180421132929.21610-8-mreitz@redhat.com
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 tests/qemu-iotests/197 | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tests/qemu-iotests/197 b/tests/qemu-iotests/197
index XXXXXXX..XXXXXXX 100755
--- a/tests/qemu-iotests/197
+++ b/tests/qemu-iotests/197
@@ -XXX,XX +XXX,XX @@ esac
 _cleanup()
 {
     _cleanup_test_img
+    rm -f "$TEST_WRAP"
     rm -f "$BLKDBG_CONF"
 }
 trap "_cleanup; exit \$status" 0 1 2 3 15
--
2.13.6

Every job gets a non-NULL job->txn on creation, but it doesn't
necessarily keep it until it is decommissioned: Finalising a job removes
it from its transaction. Therefore, calling 'blockdev-job-finalize' a
second time on an already concluded job causes an assertion failure.

Remove job->txn from the assertion in block_job_finalize() to fix this.
block_job_do_finalize() still has the same assertion, but if a job is
already removed from its transaction, block_job_apply_verb() will
already error out before we run into that assertion.

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
---
 blockjob.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/blockjob.c b/blockjob.c
index XXXXXXX..XXXXXXX 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -XXX,XX +XXX,XX @@ void block_job_complete(BlockJob *job, Error **errp)
 
 void block_job_finalize(BlockJob *job, Error **errp)
 {
-    assert(job && job->id && job->txn);
+    assert(job && job->id);
     if (block_job_apply_verb(job, BLOCK_JOB_VERB_FINALIZE, errp)) {
         return;
     }
--
2.13.6

From: Max Reitz <mreitz@redhat.com>

img_open_opts() takes a QemuOpts and converts them to a QDict, so all
values therein are strings. Then it may try to call qdict_get_bool(),
however, which will fail with a segmentation fault every time:

    $ ./qemu-img info -U --image-opts \
        driver=file,filename=/dev/null,force-share=off
    [1]    27869 segmentation fault (core dumped)  ./qemu-img info -U
    --image-opts driver=file,filename=/dev/null,force-share=off

Fix this by using qdict_get_str() and comparing the value as a string.
Also, when adding a force-share value to the QDict, add it as a string
so it fits the rest of the dict.

Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20180502202051.15493-3-mreitz@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 qemu-img.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/qemu-img.c b/qemu-img.c
index XXXXXXX..XXXXXXX 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -XXX,XX +XXX,XX @@ static BlockBackend *img_open_opts(const char *optstr,
     options = qemu_opts_to_qdict(opts, NULL);
     if (force_share) {
         if (qdict_haskey(options, BDRV_OPT_FORCE_SHARE)
-            && !qdict_get_bool(options, BDRV_OPT_FORCE_SHARE)) {
+            && strcmp(qdict_get_str(options, BDRV_OPT_FORCE_SHARE), "on")) {
             error_report("--force-share/-U conflicts with image options");
             qobject_unref(options);
             return NULL;
         }
-        qdict_put_bool(options, BDRV_OPT_FORCE_SHARE, true);
+        qdict_put_str(options, BDRV_OPT_FORCE_SHARE, "on");
     }
     blk = blk_new_open(NULL, NULL, options, flags, &local_err);
     if (!blk) {
--
2.13.6

From: Max Reitz <mreitz@redhat.com>

This adds a simple copy-on-read filter driver. It relies on the already
existing COR functionality in the central block layer code, which may be
moved here once we no longer need it there.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20180421132929.21610-2-mreitz@redhat.com
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 qapi/block-core.json |   5 +-
 block/copy-on-read.c | 171 +++++++++++++++++++++++++++++++++++++++++++++++++++
 block/Makefile.objs  |   2 +-
 3 files changed, 176 insertions(+), 2 deletions(-)
 create mode 100644 block/copy-on-read.c

diff --git a/qapi/block-core.json b/qapi/block-core.json
index XXXXXXX..XXXXXXX 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -XXX,XX +XXX,XX @@
 # @vxhs: Since 2.10
 # @throttle: Since 2.11
 # @nvme: Since 2.12
+# @copy-on-read: Since 2.13
 #
 # Since: 2.9
 ##
 { 'enum': 'BlockdevDriver',
-  'data': [ 'blkdebug', 'blkverify', 'bochs', 'cloop',
+  'data': [ 'blkdebug', 'blkverify', 'bochs', 'cloop', 'copy-on-read',
             'dmg', 'file', 'ftp', 'ftps', 'gluster', 'host_cdrom',
             'host_device', 'http', 'https', 'iscsi', 'luks', 'nbd', 'nfs',
             'null-aio', 'null-co', 'nvme', 'parallels', 'qcow', 'qcow2', 'qed',
@@ -XXX,XX +XXX,XX @@
   'blkverify':  'BlockdevOptionsBlkverify',
   'bochs':      'BlockdevOptionsGenericFormat',
   'cloop':      'BlockdevOptionsGenericFormat',
+  'copy-on-read':'BlockdevOptionsGenericFormat',
   'dmg':        'BlockdevOptionsGenericFormat',
   'file':       'BlockdevOptionsFile',
   'ftp':        'BlockdevOptionsCurlFtp',
@@ -XXX,XX +XXX,XX @@
   'blkverify':      'BlockdevCreateNotSupported',
   'bochs':          'BlockdevCreateNotSupported',
   'cloop':          'BlockdevCreateNotSupported',
+  'copy-on-read':   'BlockdevCreateNotSupported',
   'dmg':            'BlockdevCreateNotSupported',
   'file':           'BlockdevCreateOptionsFile',
   'ftp':            'BlockdevCreateNotSupported',
diff --git a/block/copy-on-read.c b/block/copy-on-read.c
new file mode 100644
index XXXXXXX..XXXXXXX
--- /dev/null
+++ b/block/copy-on-read.c
@@ -XXX,XX +XXX,XX @@
+/*
+ * Copy-on-read filter block driver
+ *
+ * Copyright (c) 2018 Red Hat, Inc.
+ *
+ * Author:
+ *   Max Reitz <mreitz@redhat.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation; either version 2 or
+ * (at your option) version 3 of the License.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "qemu/osdep.h"
+#include "block/block_int.h"
+
+
+static int cor_open(BlockDriverState *bs, QDict *options, int flags,
+                    Error **errp)
+{
+    bs->file = bdrv_open_child(NULL, options, "file", bs, &child_file, false,
+                               errp);
+    if (!bs->file) {
+        return -EINVAL;
+    }
+
+    bs->supported_write_flags = BDRV_REQ_FUA &
+                                    bs->file->bs->supported_write_flags;
+
+    bs->supported_zero_flags = (BDRV_REQ_FUA | BDRV_REQ_MAY_UNMAP) &
+                                   bs->file->bs->supported_zero_flags;
+
+    return 0;
+}
+
+
+static void cor_close(BlockDriverState *bs)
+{
+}
+
+
+#define PERM_PASSTHROUGH (BLK_PERM_CONSISTENT_READ \
+                          | BLK_PERM_WRITE \
+                          | BLK_PERM_RESIZE)
+#define PERM_UNCHANGED (BLK_PERM_ALL & ~PERM_PASSTHROUGH)
+
+static void cor_child_perm(BlockDriverState *bs, BdrvChild *c,
+                           const BdrvChildRole *role,
+                           BlockReopenQueue *reopen_queue,
+                           uint64_t perm, uint64_t shared,
+                           uint64_t *nperm, uint64_t *nshared)
+{
+    if (c == NULL) {
+        *nperm = (perm & PERM_PASSTHROUGH) | BLK_PERM_WRITE_UNCHANGED;
+        *nshared = (shared & PERM_PASSTHROUGH) | PERM_UNCHANGED;
+        return;
+    }
+
+    *nperm = (perm & PERM_PASSTHROUGH) |
+             (c->perm & PERM_UNCHANGED);
+    *nshared = (shared & PERM_PASSTHROUGH) |
+               (c->shared_perm & PERM_UNCHANGED);
+}
+
+
+static int64_t cor_getlength(BlockDriverState *bs)
+{
+    return bdrv_getlength(bs->file->bs);
+}
+
+
+static int cor_truncate(BlockDriverState *bs, int64_t offset,
+                        PreallocMode prealloc, Error **errp)
+{
+    return bdrv_truncate(bs->file, offset, prealloc, errp);
+}
+
+
+static int coroutine_fn cor_co_preadv(BlockDriverState *bs,
+                                      uint64_t offset, uint64_t bytes,
+                                      QEMUIOVector *qiov, int flags)
+{
+    return bdrv_co_preadv(bs->file, offset, bytes, qiov,
+                          flags | BDRV_REQ_COPY_ON_READ);
+}
+
+
+static int coroutine_fn cor_co_pwritev(BlockDriverState *bs,
+                                       uint64_t offset, uint64_t bytes,
+                                       QEMUIOVector *qiov, int flags)
+{
+
+    return bdrv_co_pwritev(bs->file, offset, bytes, qiov, flags);
+}
+
+
+static int coroutine_fn cor_co_pwrite_zeroes(BlockDriverState *bs,
+                                             int64_t offset, int bytes,
+                                             BdrvRequestFlags flags)
+{
+    return bdrv_co_pwrite_zeroes(bs->file, offset, bytes, flags);
+}
+
+
+static int coroutine_fn cor_co_pdiscard(BlockDriverState *bs,
+                                        int64_t offset, int bytes)
+{
+    return bdrv_co_pdiscard(bs->file->bs, offset, bytes);
+}
+
+
+static void cor_eject(BlockDriverState *bs, bool eject_flag)
180
+{
181
+ bdrv_eject(bs->file->bs, eject_flag);
182
+}
183
+
184
+
185
+static void cor_lock_medium(BlockDriverState *bs, bool locked)
186
+{
187
+ bdrv_lock_medium(bs->file->bs, locked);
188
+}
189
+
190
+
191
+static bool cor_recurse_is_first_non_filter(BlockDriverState *bs,
192
+ BlockDriverState *candidate)
193
+{
194
+ return bdrv_recurse_is_first_non_filter(bs->file->bs, candidate);
195
+}
196
+
197
+
198
+BlockDriver bdrv_copy_on_read = {
199
+ .format_name = "copy-on-read",
200
+
201
+ .bdrv_open = cor_open,
202
+ .bdrv_close = cor_close,
203
+ .bdrv_child_perm = cor_child_perm,
204
+
205
+ .bdrv_getlength = cor_getlength,
206
+ .bdrv_truncate = cor_truncate,
207
+
208
+ .bdrv_co_preadv = cor_co_preadv,
209
+ .bdrv_co_pwritev = cor_co_pwritev,
210
+ .bdrv_co_pwrite_zeroes = cor_co_pwrite_zeroes,
211
+ .bdrv_co_pdiscard = cor_co_pdiscard,
212
+
213
+ .bdrv_eject = cor_eject,
214
+ .bdrv_lock_medium = cor_lock_medium,
215
+
216
+ .bdrv_co_block_status = bdrv_co_block_status_from_file,
217
+
218
+ .bdrv_recurse_is_first_non_filter = cor_recurse_is_first_non_filter,
219
+
220
+ .has_variable_length = true,
221
+ .is_filter = true,
222
+};
223
+
224
+static void bdrv_copy_on_read_init(void)
225
+{
226
+ bdrv_register(&bdrv_copy_on_read);
227
+}
228
+
229
+block_init(bdrv_copy_on_read_init);
230
diff --git a/block/Makefile.objs b/block/Makefile.objs
231
index XXXXXXX..XXXXXXX 100644
158
index XXXXXXX..XXXXXXX 100644
232
--- a/block/Makefile.objs
159
--- a/tests/Makefile.include
233
+++ b/block/Makefile.objs
160
+++ b/tests/Makefile.include
234
@@ -XXX,XX +XXX,XX @@ block-obj-y += accounting.o dirty-bitmap.o
161
@@ -XXX,XX +XXX,XX @@ gcov-files-test-thread-pool-y = thread-pool.c
235
block-obj-y += write-threshold.o
162
gcov-files-test-hbitmap-y = util/hbitmap.c
236
block-obj-y += backup.o
163
check-unit-y += tests/test-hbitmap$(EXESUF)
237
block-obj-$(CONFIG_REPLICATION) += replication.o
164
gcov-files-test-hbitmap-y = blockjob.c
238
-block-obj-y += throttle.o
165
+check-unit-y += tests/test-bdrv-drain$(EXESUF)
239
+block-obj-y += throttle.o copy-on-read.o
166
check-unit-y += tests/test-blockjob$(EXESUF)
240
167
check-unit-y += tests/test-blockjob-txn$(EXESUF)
241
block-obj-y += crypto.o
168
check-unit-y += tests/test-x86-cpuid$(EXESUF)
242
169
@@ -XXX,XX +XXX,XX @@ tests/test-coroutine$(EXESUF): tests/test-coroutine.o $(test-block-obj-y)
170
tests/test-aio$(EXESUF): tests/test-aio.o $(test-block-obj-y)
171
tests/test-aio-multithread$(EXESUF): tests/test-aio-multithread.o $(test-block-obj-y)
172
tests/test-throttle$(EXESUF): tests/test-throttle.o $(test-block-obj-y)
173
+tests/test-bdrv-drain$(EXESUF): tests/test-bdrv-drain.o $(test-block-obj-y) $(test-util-obj-y)
174
tests/test-blockjob$(EXESUF): tests/test-blockjob.o $(test-block-obj-y) $(test-util-obj-y)
175
tests/test-blockjob-txn$(EXESUF): tests/test-blockjob-txn.o $(test-block-obj-y) $(test-util-obj-y)
176
tests/test-thread-pool$(EXESUF): tests/test-thread-pool.o $(test-block-obj-y)
243
--
177
--
244
2.13.6
178
2.13.6
245
179
246
180
diff view generated by jsdifflib
From: Eric Blake <eblake@redhat.com>

We have too many driver callback interfaces; simplify the mess
somewhat by merging the flags parameter of .bdrv_co_writev_flags()
into .bdrv_co_writev(). Note that as long as a driver doesn't set
.supported_write_flags, the flags argument will be 0 and behavior is
identical. Also note that the public function bdrv_co_writev() still
lacks a flags argument; so the driver signature is thus intentionally
slightly different. But that's not the end of the world, nor the first
time that the driver interface differs slightly from the public
interface.

Ideally, we should be rewriting all of these drivers to use modern
byte-based interfaces. But that's a more invasive patch to write
and audit, compared to the simplification done here.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 include/block/block_int.h |  2 --
 block/gluster.c           |  4 +++-
 block/io.c                | 13 ++++---------
 block/iscsi.c             |  8 ++++----
 block/parallels.c         |  4 +++-
 block/qcow.c              |  6 ++++--
 block/qed.c               |  3 ++-
 block/replication.c       |  4 +++-
 block/sheepdog.c          |  4 +++-
 block/ssh.c               |  4 +++-
 block/vhdx.c              |  4 +++-
 11 files changed, 32 insertions(+), 24 deletions(-)

diff --git a/include/block/block_int.h b/include/block/block_int.h
index XXXXXXX..XXXXXXX 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -XXX,XX +XXX,XX @@ struct BlockDriver {
     int coroutine_fn (*bdrv_co_preadv)(BlockDriverState *bs,
         uint64_t offset, uint64_t bytes, QEMUIOVector *qiov, int flags);
     int coroutine_fn (*bdrv_co_writev)(BlockDriverState *bs,
-        int64_t sector_num, int nb_sectors, QEMUIOVector *qiov);
-    int coroutine_fn (*bdrv_co_writev_flags)(BlockDriverState *bs,
         int64_t sector_num, int nb_sectors, QEMUIOVector *qiov, int flags);
     /**
      * @offset: position in bytes to write at
diff --git a/block/gluster.c b/block/gluster.c
index XXXXXXX..XXXXXXX 100644
--- a/block/gluster.c
+++ b/block/gluster.c
@@ -XXX,XX +XXX,XX @@ static coroutine_fn int qemu_gluster_co_readv(BlockDriverState *bs,
 static coroutine_fn int qemu_gluster_co_writev(BlockDriverState *bs,
                                                int64_t sector_num,
                                                int nb_sectors,
-                                               QEMUIOVector *qiov)
+                                               QEMUIOVector *qiov,
+                                               int flags)
 {
+    assert(!flags);
     return qemu_gluster_co_rw(bs, sector_num, nb_sectors, qiov, 1);
 }
 
diff --git a/block/io.c b/block/io.c
index XXXXXXX..XXXXXXX 100644
--- a/block/io.c
+++ b/block/io.c
@@ -XXX,XX +XXX,XX @@ static int coroutine_fn bdrv_driver_pwritev(BlockDriverState *bs,
     assert((bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
     assert((bytes >> BDRV_SECTOR_BITS) <= BDRV_REQUEST_MAX_SECTORS);
 
-    if (drv->bdrv_co_writev_flags) {
-        ret = drv->bdrv_co_writev_flags(bs, sector_num, nb_sectors, qiov,
-                                        flags & bs->supported_write_flags);
-        flags &= ~bs->supported_write_flags;
-    } else {
-        assert(drv->bdrv_co_writev);
-        assert(!bs->supported_write_flags);
-        ret = drv->bdrv_co_writev(bs, sector_num, nb_sectors, qiov);
-    }
+    assert(drv->bdrv_co_writev);
+    ret = drv->bdrv_co_writev(bs, sector_num, nb_sectors, qiov,
+                              flags & bs->supported_write_flags);
+    flags &= ~bs->supported_write_flags;
 
 emulate_flags:
     if (ret == 0 && (flags & BDRV_REQ_FUA)) {
diff --git a/block/iscsi.c b/block/iscsi.c
index XXXXXXX..XXXXXXX 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -XXX,XX +XXX,XX @@ static inline bool iscsi_allocmap_is_valid(IscsiLun *iscsilun,
 }
 
 static int coroutine_fn
-iscsi_co_writev_flags(BlockDriverState *bs, int64_t sector_num, int nb_sectors,
-                      QEMUIOVector *iov, int flags)
+iscsi_co_writev(BlockDriverState *bs, int64_t sector_num, int nb_sectors,
+                QEMUIOVector *iov, int flags)
 {
     IscsiLun *iscsilun = bs->opaque;
     struct IscsiTask iTask;
@@ -XXX,XX +XXX,XX @@ static BlockDriver bdrv_iscsi = {
     .bdrv_co_pdiscard      = iscsi_co_pdiscard,
     .bdrv_co_pwrite_zeroes = iscsi_co_pwrite_zeroes,
     .bdrv_co_readv         = iscsi_co_readv,
-    .bdrv_co_writev_flags  = iscsi_co_writev_flags,
+    .bdrv_co_writev        = iscsi_co_writev,
     .bdrv_co_flush_to_disk = iscsi_co_flush,
 
 #ifdef __linux__
@@ -XXX,XX +XXX,XX @@ static BlockDriver bdrv_iser = {
     .bdrv_co_pdiscard      = iscsi_co_pdiscard,
     .bdrv_co_pwrite_zeroes = iscsi_co_pwrite_zeroes,
     .bdrv_co_readv         = iscsi_co_readv,
-    .bdrv_co_writev_flags  = iscsi_co_writev_flags,
+    .bdrv_co_writev        = iscsi_co_writev,
     .bdrv_co_flush_to_disk = iscsi_co_flush,
 
 #ifdef __linux__
diff --git a/block/parallels.c b/block/parallels.c
index XXXXXXX..XXXXXXX 100644
--- a/block/parallels.c
+++ b/block/parallels.c
@@ -XXX,XX +XXX,XX @@ static int coroutine_fn parallels_co_block_status(BlockDriverState *bs,
 }
 
 static coroutine_fn int parallels_co_writev(BlockDriverState *bs,
-        int64_t sector_num, int nb_sectors, QEMUIOVector *qiov)
+        int64_t sector_num, int nb_sectors,
+        QEMUIOVector *qiov, int flags)
 {
     BDRVParallelsState *s = bs->opaque;
     uint64_t bytes_done = 0;
     QEMUIOVector hd_qiov;
     int ret = 0;
 
+    assert(!flags);
     qemu_iovec_init(&hd_qiov, qiov->niov);
 
     while (nb_sectors > 0) {
diff --git a/block/qcow.c b/block/qcow.c
index XXXXXXX..XXXXXXX 100644
--- a/block/qcow.c
+++ b/block/qcow.c
@@ -XXX,XX +XXX,XX @@ static coroutine_fn int qcow_co_readv(BlockDriverState *bs, int64_t sector_num,
 }
 
 static coroutine_fn int qcow_co_writev(BlockDriverState *bs, int64_t sector_num,
-                                       int nb_sectors, QEMUIOVector *qiov)
+                                       int nb_sectors, QEMUIOVector *qiov,
+                                       int flags)
 {
     BDRVQcowState *s = bs->opaque;
     int index_in_cluster;
@@ -XXX,XX +XXX,XX @@ static coroutine_fn int qcow_co_writev(BlockDriverState *bs, int64_t sector_num,
     uint8_t *buf;
     void *orig_buf;
 
+    assert(!flags);
     s->cluster_cache_offset = -1; /* disable compressed cache */
 
     /* We must always copy the iov when encrypting, so we
@@ -XXX,XX +XXX,XX @@ qcow_co_pwritev_compressed(BlockDriverState *bs, uint64_t offset,
     if (ret != Z_STREAM_END || out_len >= s->cluster_size) {
         /* could not compress: write normal cluster */
         ret = qcow_co_writev(bs, offset >> BDRV_SECTOR_BITS,
-                             bytes >> BDRV_SECTOR_BITS, qiov);
+                             bytes >> BDRV_SECTOR_BITS, qiov, 0);
         if (ret < 0) {
             goto fail;
         }
diff --git a/block/qed.c b/block/qed.c
index XXXXXXX..XXXXXXX 100644
--- a/block/qed.c
+++ b/block/qed.c
@@ -XXX,XX +XXX,XX @@ static int coroutine_fn bdrv_qed_co_readv(BlockDriverState *bs,
 
 static int coroutine_fn bdrv_qed_co_writev(BlockDriverState *bs,
                                            int64_t sector_num, int nb_sectors,
-                                           QEMUIOVector *qiov)
+                                           QEMUIOVector *qiov, int flags)
 {
+    assert(!flags);
     return qed_co_request(bs, sector_num, qiov, nb_sectors, QED_AIOCB_WRITE);
 }
 
diff --git a/block/replication.c b/block/replication.c
index XXXXXXX..XXXXXXX 100644
--- a/block/replication.c
+++ b/block/replication.c
@@ -XXX,XX +XXX,XX @@ out:
 static coroutine_fn int replication_co_writev(BlockDriverState *bs,
                                               int64_t sector_num,
                                               int remaining_sectors,
-                                              QEMUIOVector *qiov)
+                                              QEMUIOVector *qiov,
+                                              int flags)
 {
     BDRVReplicationState *s = bs->opaque;
     QEMUIOVector hd_qiov;
@@ -XXX,XX +XXX,XX @@ static coroutine_fn int replication_co_writev(BlockDriverState *bs,
     int ret;
     int64_t n;
 
+    assert(!flags);
     ret = replication_get_io_status(s);
     if (ret < 0) {
         goto out;
diff --git a/block/sheepdog.c b/block/sheepdog.c
index XXXXXXX..XXXXXXX 100644
--- a/block/sheepdog.c
+++ b/block/sheepdog.c
@@ -XXX,XX +XXX,XX @@ static void sd_aio_complete(SheepdogAIOCB *acb)
 }
 
 static coroutine_fn int sd_co_writev(BlockDriverState *bs, int64_t sector_num,
-                                     int nb_sectors, QEMUIOVector *qiov)
+                                     int nb_sectors, QEMUIOVector *qiov,
+                                     int flags)
 {
     SheepdogAIOCB acb;
     int ret;
     int64_t offset = (sector_num + nb_sectors) * BDRV_SECTOR_SIZE;
     BDRVSheepdogState *s = bs->opaque;
 
+    assert(!flags);
     if (offset > s->inode.vdi_size) {
         ret = sd_truncate(bs, offset, PREALLOC_MODE_OFF, NULL);
         if (ret < 0) {
diff --git a/block/ssh.c b/block/ssh.c
index XXXXXXX..XXXXXXX 100644
--- a/block/ssh.c
+++ b/block/ssh.c
@@ -XXX,XX +XXX,XX @@ static int ssh_write(BDRVSSHState *s, BlockDriverState *bs,
 
 static coroutine_fn int ssh_co_writev(BlockDriverState *bs,
                                       int64_t sector_num,
-                                      int nb_sectors, QEMUIOVector *qiov)
+                                      int nb_sectors, QEMUIOVector *qiov,
+                                      int flags)
 {
     BDRVSSHState *s = bs->opaque;
     int ret;
 
+    assert(!flags);
     qemu_co_mutex_lock(&s->lock);
     ret = ssh_write(s, bs, sector_num * BDRV_SECTOR_SIZE,
                     nb_sectors * BDRV_SECTOR_SIZE, qiov);
diff --git a/block/vhdx.c b/block/vhdx.c
index XXXXXXX..XXXXXXX 100644
--- a/block/vhdx.c
+++ b/block/vhdx.c
@@ -XXX,XX +XXX,XX @@ int vhdx_user_visible_write(BlockDriverState *bs, BDRVVHDXState *s)
 }
 
 static coroutine_fn int vhdx_co_writev(BlockDriverState *bs, int64_t sector_num,
-                                       int nb_sectors, QEMUIOVector *qiov)
+                                       int nb_sectors, QEMUIOVector *qiov,
+                                       int flags)
 {
     int ret = -ENOTSUP;
     BDRVVHDXState *s = bs->opaque;
@@ -XXX,XX +XXX,XX @@ static coroutine_fn int vhdx_co_writev(BlockDriverState *bs, int64_t sector_num,
     uint64_t bat_prior_offset = 0;
     bool bat_update = false;
 
+    assert(!flags);
     qemu_iovec_init(&hd_qiov, qiov->niov);
 
     qemu_co_mutex_lock(&s->lock);
--
2.13.6

Now that the bdrv_drain_invoke() calls are pulled up to the callers of
bdrv_drain_recurse(), the 'begin' parameter isn't needed any more.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 block/io.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/block/io.c b/block/io.c
index XXXXXXX..XXXXXXX 100644
--- a/block/io.c
+++ b/block/io.c
@@ -XXX,XX +XXX,XX @@ static void bdrv_drain_invoke(BlockDriverState *bs, bool begin)
 }
 
-static bool bdrv_drain_recurse(BlockDriverState *bs, bool begin)
+static bool bdrv_drain_recurse(BlockDriverState *bs)
 {
     BdrvChild *child, *tmp;
     bool waited;
@@ -XXX,XX +XXX,XX @@ static bool bdrv_drain_recurse(BlockDriverState *bs, bool begin)
          */
         bdrv_ref(bs);
     }
-    waited |= bdrv_drain_recurse(bs, begin);
+    waited |= bdrv_drain_recurse(bs);
     if (in_main_loop) {
         bdrv_unref(bs);
     }
@@ -XXX,XX +XXX,XX @@ void bdrv_drained_begin(BlockDriverState *bs)
     }
 
     bdrv_drain_invoke(bs, true);
-    bdrv_drain_recurse(bs, true);
+    bdrv_drain_recurse(bs);
 }
 
 void bdrv_drained_end(BlockDriverState *bs)
@@ -XXX,XX +XXX,XX @@ void bdrv_drained_end(BlockDriverState *bs)
 
     bdrv_parent_drained_end(bs);
     bdrv_drain_invoke(bs, false);
-    bdrv_drain_recurse(bs, false);
+    bdrv_drain_recurse(bs);
     aio_enable_external(bdrv_get_aio_context(bs));
 }
 
@@ -XXX,XX +XXX,XX @@ void bdrv_drain_all_begin(void)
         aio_context_acquire(aio_context);
         for (bs = bdrv_first(&it); bs; bs = bdrv_next(&it)) {
             if (aio_context == bdrv_get_aio_context(bs)) {
-                waited |= bdrv_drain_recurse(bs, true);
+                waited |= bdrv_drain_recurse(bs);
             }
         }
         aio_context_release(aio_context);
@@ -XXX,XX +XXX,XX @@ void bdrv_drain_all_end(void)
         aio_enable_external(aio_context);
         bdrv_parent_drained_end(bs);
         bdrv_drain_invoke(bs, false);
-        bdrv_drain_recurse(bs, false);
+        bdrv_drain_recurse(bs);
         aio_context_release(aio_context);
     }
 
--
2.13.6
From: Max Reitz <mreitz@redhat.com>

We just need to forward it to quorum's children (except in case of a
rewrite because of corruption), but for that we first have to support
flags in child requests at all.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20180421132929.21610-6-mreitz@redhat.com
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 block/quorum.c | 19 +++++++++++++------
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/block/quorum.c b/block/quorum.c
index XXXXXXX..XXXXXXX 100644
--- a/block/quorum.c
+++ b/block/quorum.c
@@ -XXX,XX +XXX,XX @@ struct QuorumAIOCB {
     /* Request metadata */
     uint64_t offset;
     uint64_t bytes;
+    int flags;
 
     QEMUIOVector *qiov;         /* calling IOV */
 
@@ -XXX,XX +XXX,XX @@ static bool quorum_64bits_compare(QuorumVoteValue *a, QuorumVoteValue *b)
 static QuorumAIOCB *quorum_aio_get(BlockDriverState *bs,
                                    QEMUIOVector *qiov,
                                    uint64_t offset,
-                                   uint64_t bytes)
+                                   uint64_t bytes,
+                                   int flags)
 {
     BDRVQuorumState *s = bs->opaque;
     QuorumAIOCB *acb = g_new(QuorumAIOCB, 1);
@@ -XXX,XX +XXX,XX @@ static QuorumAIOCB *quorum_aio_get(BlockDriverState *bs,
         .bs                 = bs,
         .offset             = offset,
         .bytes              = bytes,
+        .flags              = flags,
         .qiov               = qiov,
         .votes.compare      = quorum_sha256_compare,
         .votes.vote_list    = QLIST_HEAD_INITIALIZER(acb.votes.vote_list),
@@ -XXX,XX +XXX,XX @@ static void quorum_rewrite_entry(void *opaque)
     BDRVQuorumState *s = acb->bs->opaque;
 
     /* Ignore any errors, it's just a correction attempt for already
-     * corrupted data. */
+     * corrupted data.
+     * Mask out BDRV_REQ_WRITE_UNCHANGED because this overwrites the
+     * area with different data from the other children. */
     bdrv_co_pwritev(s->children[co->idx], acb->offset, acb->bytes,
-                    acb->qiov, 0);
+                    acb->qiov, acb->flags & ~BDRV_REQ_WRITE_UNCHANGED);
 
     /* Wake up the caller after the last rewrite */
     acb->rewrite_count--;
@@ -XXX,XX +XXX,XX @@ static int quorum_co_preadv(BlockDriverState *bs, uint64_t offset,
                             uint64_t bytes, QEMUIOVector *qiov, int flags)
 {
     BDRVQuorumState *s = bs->opaque;
-    QuorumAIOCB *acb = quorum_aio_get(bs, qiov, offset, bytes);
+    QuorumAIOCB *acb = quorum_aio_get(bs, qiov, offset, bytes, flags);
     int ret;
 
     acb->is_read = true;
@@ -XXX,XX +XXX,XX @@ static void write_quorum_entry(void *opaque)
 
         sacb->bs = s->children[i]->bs;
         sacb->ret = bdrv_co_pwritev(s->children[i], acb->offset, acb->bytes,
-                                    acb->qiov, 0);
+                                    acb->qiov, acb->flags);
         if (sacb->ret == 0) {
             acb->success_count++;
         } else {
@@ -XXX,XX +XXX,XX @@ static int quorum_co_pwritev(BlockDriverState *bs, uint64_t offset,
                              uint64_t bytes, QEMUIOVector *qiov, int flags)
 {
     BDRVQuorumState *s = bs->opaque;
-    QuorumAIOCB *acb = quorum_aio_get(bs, qiov, offset, bytes);
+    QuorumAIOCB *acb = quorum_aio_get(bs, qiov, offset, bytes, flags);
     int i, ret;
 
     for (i = 0; i < s->num_children; i++) {
@@ -XXX,XX +XXX,XX @@ static int quorum_open(BlockDriverState *bs, QDict *options, int flags,
     }
     s->next_child_index = s->num_children;
 
+    bs->supported_write_flags = BDRV_REQ_WRITE_UNCHANGED;
+
     g_free(opened);
     goto exit;
 
--
2.13.6

The device is drained, so there is no point in waiting for requests at
the end of the drained section. Remove the bdrv_drain_recurse() calls
there.

The bdrv_drain_recurse() calls were introduced in commit 481cad48e5e
in order to call the .bdrv_co_drain_end() driver callback. This is now
done by a separate bdrv_drain_invoke() call.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 block/io.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/block/io.c b/block/io.c
index XXXXXXX..XXXXXXX 100644
--- a/block/io.c
+++ b/block/io.c
@@ -XXX,XX +XXX,XX @@ void bdrv_drained_end(BlockDriverState *bs)
 
     bdrv_parent_drained_end(bs);
     bdrv_drain_invoke(bs, false);
-    bdrv_drain_recurse(bs);
     aio_enable_external(bdrv_get_aio_context(bs));
 }
 
@@ -XXX,XX +XXX,XX @@ void bdrv_drain_all_end(void)
         aio_enable_external(aio_context);
         bdrv_parent_drained_end(bs);
         bdrv_drain_invoke(bs, false);
-        bdrv_drain_recurse(bs);
         aio_context_release(aio_context);
     }
 
--
2.13.6
From: Eric Blake <eblake@redhat.com>

We are gradually moving away from sector-based interfaces, towards
byte-based. Add new byte-based aio callbacks for read and write,
to match the fact that bdrv_aio_pdiscard is already byte-based.

Ideally, drivers should be converted to use coroutine callbacks
rather than aio; but that is not quite as trivial (and if we were
to do that conversion, the null-aio driver would disappear), so for
the short term, converting the signature but keeping things with
aio is easier. However, we CAN declare that a driver that uses
the byte-based aio interfaces now defaults to byte-based
operations, and must explicitly provide a refresh_limits override
to stick with larger alignments (making the alignment issues more
obvious directly in the drivers touched in the next few patches).

Once all drivers are converted, the sector-based aio callbacks will
be removed; in the meantime, a FIXME comment is added due to a
slight inefficiency that will be touched up as part of that later
cleanup.

Simplify some instances of 'bs->drv' into 'drv' while touching this,
since the local variable already exists to reduce typing.

Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 include/block/block_int.h |  6 ++++++
 block/io.c                | 38 +++++++++++++++++++++++++++++---------
 2 files changed, 35 insertions(+), 9 deletions(-)

diff --git a/include/block/block_int.h b/include/block/block_int.h
index XXXXXXX..XXXXXXX 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -XXX,XX +XXX,XX @@ struct BlockDriver {
     BlockAIOCB *(*bdrv_aio_readv)(BlockDriverState *bs,
         int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
         BlockCompletionFunc *cb, void *opaque);
+    BlockAIOCB *(*bdrv_aio_preadv)(BlockDriverState *bs,
+        uint64_t offset, uint64_t bytes, QEMUIOVector *qiov, int flags,
+        BlockCompletionFunc *cb, void *opaque);
     BlockAIOCB *(*bdrv_aio_writev)(BlockDriverState *bs,
         int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
         BlockCompletionFunc *cb, void *opaque);
+    BlockAIOCB *(*bdrv_aio_pwritev)(BlockDriverState *bs,
+        uint64_t offset, uint64_t bytes, QEMUIOVector *qiov, int flags,
+        BlockCompletionFunc *cb, void *opaque);
     BlockAIOCB *(*bdrv_aio_flush)(BlockDriverState *bs,
         BlockCompletionFunc *cb, void *opaque);
     BlockAIOCB *(*bdrv_aio_pdiscard)(BlockDriverState *bs,
diff --git a/block/io.c b/block/io.c
index XXXXXXX..XXXXXXX 100644
--- a/block/io.c
+++ b/block/io.c
@@ -XXX,XX +XXX,XX @@ void bdrv_refresh_limits(BlockDriverState *bs, Error **errp)
     }
 
     /* Default alignment based on whether driver has byte interface */
-    bs->bl.request_alignment = drv->bdrv_co_preadv ? 1 : 512;
+    bs->bl.request_alignment = (drv->bdrv_co_preadv ||
+                                drv->bdrv_aio_preadv) ? 1 : 512;
 
     /* Take some limits from the children as a default */
     if (bs->file) {
@@ -XXX,XX +XXX,XX @@ static int coroutine_fn bdrv_driver_preadv(BlockDriverState *bs,
         return drv->bdrv_co_preadv(bs, offset, bytes, qiov, flags);
     }
 
+    /* FIXME - no need to calculate these if .bdrv_aio_preadv exists */
     sector_num = offset >> BDRV_SECTOR_BITS;
     nb_sectors = bytes >> BDRV_SECTOR_BITS;
 
-    assert((offset & (BDRV_SECTOR_SIZE - 1)) == 0);
-    assert((bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
-    assert((bytes >> BDRV_SECTOR_BITS) <= BDRV_REQUEST_MAX_SECTORS);
+    if (!drv->bdrv_aio_preadv) {
+        assert((offset & (BDRV_SECTOR_SIZE - 1)) == 0);
+        assert((bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
+        assert((bytes >> BDRV_SECTOR_BITS) <= BDRV_REQUEST_MAX_SECTORS);
+    }
 
     if (drv->bdrv_co_readv) {
         return drv->bdrv_co_readv(bs, sector_num, nb_sectors, qiov);
@@ -XXX,XX +XXX,XX @@ static int coroutine_fn bdrv_driver_preadv(BlockDriverState *bs,
             .coroutine = qemu_coroutine_self(),
         };
 
-        acb = bs->drv->bdrv_aio_readv(bs, sector_num, qiov, nb_sectors,
-                                      bdrv_co_io_em_complete, &co);
+        if (drv->bdrv_aio_preadv) {
+            acb = drv->bdrv_aio_preadv(bs, offset, bytes, qiov, flags,
+                                       bdrv_co_io_em_complete, &co);
+        } else {
+            acb = drv->bdrv_aio_readv(bs, sector_num, qiov, nb_sectors,
+                                      bdrv_co_io_em_complete, &co);
+        }
         if (acb == NULL) {
             return -EIO;
         } else {
@@ -XXX,XX +XXX,XX @@ static int coroutine_fn bdrv_driver_pwritev(BlockDriverState *bs,
         goto emulate_flags;
     }
 
+    /* FIXME - no need to calculate these if .bdrv_aio_pwritev exists */
     sector_num = offset >> BDRV_SECTOR_BITS;
     nb_sectors = bytes >> BDRV_SECTOR_BITS;
 
-    assert((offset & (BDRV_SECTOR_SIZE - 1)) == 0);
-    assert((bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
-    assert((bytes >> BDRV_SECTOR_BITS) <= BDRV_REQUEST_MAX_SECTORS);
+    if (!drv->bdrv_aio_pwritev) {
+        assert((offset & (BDRV_SECTOR_SIZE - 1)) == 0);
+        assert((bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
+        assert((bytes >> BDRV_SECTOR_BITS) <= BDRV_REQUEST_MAX_SECTORS);
+    }
 
     if (drv->bdrv_co_writev_flags) {
         ret = drv->bdrv_co_writev_flags(bs, sector_num, nb_sectors, qiov,
@@ -XXX,XX +XXX,XX @@ static int coroutine_fn bdrv_driver_pwritev(BlockDriverState *bs,
             .coroutine = qemu_coroutine_self(),
         };
 
-        acb = bs->drv->bdrv_aio_writev(bs, sector_num, qiov, nb_sectors,
+        if (drv->bdrv_aio_pwritev) {
+            acb = drv->bdrv_aio_pwritev(bs, offset, bytes, qiov,
+                                        flags & bs->supported_write_flags,
+                                        bdrv_co_io_em_complete, &co);
+            flags &= ~bs->supported_write_flags;
+        } else {
+            assert(!bs->supported_write_flags);
+            acb = drv->bdrv_aio_writev(bs, sector_num, qiov, nb_sectors,
                                        bdrv_co_io_em_complete, &co);
+        }
         if (acb == NULL) {
             ret = -EIO;
         } else {
--
2.13.6

Drain requests are propagated to child nodes, parent nodes and directly
to the AioContext. The order in which this happened was different
between all combinations of drain/drain_all and begin/end.

The correct order is to keep children only drained when their parents
are also drained. This means that at the start of a drained section, the
AioContext needs to be drained first, the parents second and only then
the children. The correct order for the end of a drained section is the
opposite.

This patch changes the three other functions to follow the example of
bdrv_drained_begin(), which is the only one that got it right.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 block/io.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/block/io.c b/block/io.c
index XXXXXXX..XXXXXXX 100644
--- a/block/io.c
+++ b/block/io.c
@@ -XXX,XX +XXX,XX @@ void bdrv_drained_begin(BlockDriverState *bs)
         return;
     }
 
+    /* Stop things in parent-to-child order */
     if (atomic_fetch_inc(&bs->quiesce_counter) == 0) {
         aio_disable_external(bdrv_get_aio_context(bs));
         bdrv_parent_drained_begin(bs);
@@ -XXX,XX +XXX,XX @@ void bdrv_drained_end(BlockDriverState *bs)
         return;
     }
 
-    bdrv_parent_drained_end(bs);
+    /* Re-enable things in child-to-parent order */
     bdrv_drain_invoke(bs, false);
+    bdrv_parent_drained_end(bs);
     aio_enable_external(bdrv_get_aio_context(bs));
 }
 
@@ -XXX,XX +XXX,XX @@ void bdrv_drain_all_begin(void)
     for (bs = bdrv_first(&it); bs; bs = bdrv_next(&it)) {
         AioContext *aio_context = bdrv_get_aio_context(bs);
 
+        /* Stop things in parent-to-child order */
         aio_context_acquire(aio_context);
-        bdrv_parent_drained_begin(bs);
         aio_disable_external(aio_context);
+        bdrv_parent_drained_begin(bs);
         bdrv_drain_invoke(bs, true);
         aio_context_release(aio_context);
 
@@ -XXX,XX +XXX,XX @@ void bdrv_drain_all_end(void)
     for (bs = bdrv_first(&it); bs; bs = bdrv_next(&it)) {
         AioContext *aio_context = bdrv_get_aio_context(bs);
 
+        /* Re-enable things in child-to-parent order */
         aio_context_acquire(aio_context);
-        aio_enable_external(aio_context);
-        bdrv_parent_drained_end(bs);
         bdrv_drain_invoke(bs, false);
+        bdrv_parent_drained_end(bs);
+        aio_enable_external(aio_context);
         aio_context_release(aio_context);
     }
--
2.13.6
1
From: Alberto Garcia <berto@igalia.com>

The QMP version of this command can take a qdev ID since 7a9877a02635,
but the HMP version is still using the deprecated block device name so
there's no way to refer to a block device added like this:

   -blockdev node-name=disk0,driver=qcow2,file.driver=file,file.filename=hd.qcow2
   -device virtio-blk-pci,id=virtio-blk-pci0,drive=disk0

This patch works around this problem by using the specified name as a
qdev ID if the block device name is not found.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 hmp.c           | 14 ++++++++++++--
 hmp-commands.hx |  3 ++-
 2 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/hmp.c b/hmp.c
index XXXXXXX..XXXXXXX 100644
--- a/hmp.c
+++ b/hmp.c
@@ -XXX,XX +XXX,XX @@ void hmp_change(Monitor *mon, const QDict *qdict)
 void hmp_block_set_io_throttle(Monitor *mon, const QDict *qdict)
 {
     Error *err = NULL;
+    char *device = (char *) qdict_get_str(qdict, "device");
     BlockIOThrottle throttle = {
-        .has_device = true,
-        .device = (char *) qdict_get_str(qdict, "device"),
         .bps = qdict_get_int(qdict, "bps"),
         .bps_rd = qdict_get_int(qdict, "bps_rd"),
         .bps_wr = qdict_get_int(qdict, "bps_wr"),
@@ -XXX,XX +XXX,XX @@ void hmp_block_set_io_throttle(Monitor *mon, const QDict *qdict)
         .iops_wr = qdict_get_int(qdict, "iops_wr"),
     };
 
+    /* qmp_block_set_io_throttle has separate parameters for the
+     * (deprecated) block device name and the qdev ID but the HMP
+     * version has only one, so we must decide which one to pass. */
+    if (blk_by_name(device)) {
+        throttle.has_device = true;
+        throttle.device = device;
+    } else {
+        throttle.has_id = true;
+        throttle.id = device;
+    }
+
     qmp_block_set_io_throttle(&throttle, &err);
     hmp_handle_error(mon, &err);
 }

diff --git a/hmp-commands.hx b/hmp-commands.hx
index XXXXXXX..XXXXXXX 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -XXX,XX +XXX,XX @@ ETEXI
 STEXI
 @item block_set_io_throttle @var{device} @var{bps} @var{bps_rd} @var{bps_wr} @var{iops} @var{iops_rd} @var{iops_wr}
 @findex block_set_io_throttle
-Change I/O throttle limits for a block drive to @var{bps} @var{bps_rd} @var{bps_wr} @var{iops} @var{iops_rd} @var{iops_wr}
+Change I/O throttle limits for a block drive to @var{bps} @var{bps_rd} @var{bps_wr} @var{iops} @var{iops_rd} @var{iops_wr}.
+@var{device} can be a block device name, a qdev ID or a QOM path.
 ETEXI
 
     {
-- 
2.13.6

Commit 15afd94a047 added code to acquire and release the AioContext in
qemuio_command(). This means that the lock is taken twice now in the
call path from hmp_qemu_io(). This causes BDRV_POLL_WHILE() to hang for
any requests issued to nodes in a non-mainloop AioContext.

Dropping the first locking from hmp_qemu_io() fixes the problem.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 hmp.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/hmp.c b/hmp.c
index XXXXXXX..XXXXXXX 100644
--- a/hmp.c
+++ b/hmp.c
@@ -XXX,XX +XXX,XX @@ void hmp_qemu_io(Monitor *mon, const QDict *qdict)
 {
     BlockBackend *blk;
     BlockBackend *local_blk = NULL;
-    AioContext *aio_context;
     const char* device = qdict_get_str(qdict, "device");
     const char* command = qdict_get_str(qdict, "command");
     Error *err = NULL;
@@ -XXX,XX +XXX,XX @@ void hmp_qemu_io(Monitor *mon, const QDict *qdict)
         }
     }
 
-    aio_context = blk_get_aio_context(blk);
-    aio_context_acquire(aio_context);
-
     /*
      * Notably absent: Proper permission management. This is sad, but it seems
      * almost impossible to achieve without changing the semantics and thereby
@@ -XXX,XX +XXX,XX @@ void hmp_qemu_io(Monitor *mon, const QDict *qdict)
      */
     qemuio_command(blk, command);
 
-    aio_context_release(aio_context);
-
 fail:
     blk_unref(local_blk);
     hmp_handle_error(mon, &err);
-- 
2.13.6
From: Alberto Garcia <berto@igalia.com>

The L2 and refcount caches have default sizes that can be overridden
using the l2-cache-size and refcount-cache-size (an additional
parameter named cache-size sets the combined size of both caches).

Unless forced by one of the aforementioned parameters, QEMU will set
the unspecified sizes so that the L2 cache is 4 times larger than the
refcount cache.

This is based on the premise that the refcount metadata needs to be
only a fourth of the L2 metadata to cover the same amount of disk
space. This is incorrect for two reasons:

 a) The amount of disk covered by an L2 table depends solely on the
    cluster size, but in the case of a refcount block it depends on
    the cluster size *and* the width of each refcount entry.
    The 4/1 ratio is only valid with 16-bit entries (the default).

 b) When we talk about disk space and L2 tables we are talking about
    guest space (L2 tables map guest clusters to host clusters),
    whereas refcount blocks are used for host clusters (including
    L1/L2 tables and the refcount blocks themselves). On a fully
    populated (and uncompressed) qcow2 file, image size > virtual size
    so there are more refcount entries than L2 entries.

Problem (a) could be fixed by adjusting the algorithm to take into
account the refcount entry width. Problem (b) could be fixed by
increasing a bit the refcount cache size to account for the clusters
used for qcow2 metadata.

However this patch takes a completely different approach and instead
of keeping a ratio between both cache sizes it assigns as much as
possible to the L2 cache and the remainder to the refcount cache.

The reason is that L2 tables are used for every single I/O request
from the guest and the effect of increasing the cache is significant
and clearly measurable. Refcount blocks are however only used for
cluster allocation and internal snapshots and in practice are accessed
sequentially in most cases, so the effect of increasing the cache is
negligible (even when doing random writes from the guest).

So, make the refcount cache as small as possible unless the user
explicitly asks for a larger one.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 9695182c2eb11b77cb319689a1ebaa4e7c9d6591.1523968389.git.berto@igalia.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 block/qcow2.h              |  4 ----
 block/qcow2.c              | 31 +++++++++++++++++++------------
 tests/qemu-iotests/137.out |  2 +-
 3 files changed, 20 insertions(+), 17 deletions(-)

diff --git a/block/qcow2.h b/block/qcow2.h
index XXXXXXX..XXXXXXX 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -XXX,XX +XXX,XX @@
 #define DEFAULT_L2_CACHE_CLUSTERS 8 /* clusters */
 #define DEFAULT_L2_CACHE_BYTE_SIZE 1048576 /* bytes */
 
-/* The refblock cache needs only a fourth of the L2 cache size to cover as many
- * clusters */
-#define DEFAULT_L2_REFCOUNT_SIZE_RATIO 4
-
 #define DEFAULT_CLUSTER_SIZE 65536
 
diff --git a/block/qcow2.c b/block/qcow2.c
index XXXXXXX..XXXXXXX 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -XXX,XX +XXX,XX @@ static void read_cache_sizes(BlockDriverState *bs, QemuOpts *opts,
         } else if (refcount_cache_size_set) {
             *l2_cache_size = combined_cache_size - *refcount_cache_size;
         } else {
-            *refcount_cache_size = combined_cache_size
-                                 / (DEFAULT_L2_REFCOUNT_SIZE_RATIO + 1);
-            *l2_cache_size = combined_cache_size - *refcount_cache_size;
+            uint64_t virtual_disk_size = bs->total_sectors * BDRV_SECTOR_SIZE;
+            uint64_t max_l2_cache = virtual_disk_size / (s->cluster_size / 8);
+            uint64_t min_refcount_cache =
+                (uint64_t) MIN_REFCOUNT_CACHE_SIZE * s->cluster_size;
+
+            /* Assign as much memory as possible to the L2 cache, and
+             * use the remainder for the refcount cache */
+            if (combined_cache_size >= max_l2_cache + min_refcount_cache) {
+                *l2_cache_size = max_l2_cache;
+                *refcount_cache_size = combined_cache_size - *l2_cache_size;
+            } else {
+                *refcount_cache_size =
+                    MIN(combined_cache_size, min_refcount_cache);
+                *l2_cache_size = combined_cache_size - *refcount_cache_size;
+            }
         }
     } else {
-        if (!l2_cache_size_set && !refcount_cache_size_set) {
+        if (!l2_cache_size_set) {
             *l2_cache_size = MAX(DEFAULT_L2_CACHE_BYTE_SIZE,
                                  (uint64_t)DEFAULT_L2_CACHE_CLUSTERS
                                  * s->cluster_size);
-            *refcount_cache_size = *l2_cache_size
-                                 / DEFAULT_L2_REFCOUNT_SIZE_RATIO;
-        } else if (!l2_cache_size_set) {
-            *l2_cache_size = *refcount_cache_size
-                           * DEFAULT_L2_REFCOUNT_SIZE_RATIO;
-        } else if (!refcount_cache_size_set) {
-            *refcount_cache_size = *l2_cache_size
-                                 / DEFAULT_L2_REFCOUNT_SIZE_RATIO;
+        }
+        if (!refcount_cache_size_set) {
+            *refcount_cache_size = MIN_REFCOUNT_CACHE_SIZE * s->cluster_size;
         }
     }

diff --git a/tests/qemu-iotests/137.out b/tests/qemu-iotests/137.out
index XXXXXXX..XXXXXXX 100644
--- a/tests/qemu-iotests/137.out
+++ b/tests/qemu-iotests/137.out
@@ -XXX,XX +XXX,XX @@ refcount-cache-size may not exceed cache-size
 L2 cache size too big
 L2 cache entry size must be a power of two between 512 and the cluster size (65536)
 L2 cache entry size must be a power of two between 512 and the cluster size (65536)
-L2 cache size too big
+Refcount cache size too big
 Conflicting values for qcow2 options 'overlap-check' ('constant') and 'overlap-check.template' ('all')
 Unsupported value 'blubb' for qcow2 option 'overlap-check'. Allowed are any of the following: none, constant, cached, all
 Unsupported value 'blubb' for qcow2 option 'overlap-check'. Allowed are any of the following: none, constant, cached, all
-- 
2.13.6

From: Edgar Kaziakhmedov <edgar.kaziakhmedov@virtuozzo.com>

Since bdrv_co_preadv does all necessary checks, including
reading after the end of the backing file, avoid duplicating
the verification before the bdrv_co_preadv call.

Signed-off-by: Edgar Kaziakhmedov <edgar.kaziakhmedov@virtuozzo.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/qcow2.h |  3 ---
 block/qcow2.c | 51 ++++++++-------------------------------------------
 2 files changed, 8 insertions(+), 46 deletions(-)

diff --git a/block/qcow2.h b/block/qcow2.h
index XXXXXXX..XXXXXXX 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -XXX,XX +XXX,XX @@ uint32_t offset_to_reftable_index(BDRVQcow2State *s, uint64_t offset)
 }
 
 /* qcow2.c functions */
-int qcow2_backing_read1(BlockDriverState *bs, QEMUIOVector *qiov,
-                        int64_t sector_num, int nb_sectors);
-
 int64_t qcow2_refcount_metadata_size(int64_t clusters, size_t cluster_size,
                                      int refcount_order, bool generous_increase,
                                      uint64_t *refblock_count);

diff --git a/block/qcow2.c b/block/qcow2.c
index XXXXXXX..XXXXXXX 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -XXX,XX +XXX,XX @@ static int64_t coroutine_fn qcow2_co_get_block_status(BlockDriverState *bs,
     return status;
 }
 
-/* handle reading after the end of the backing file */
-int qcow2_backing_read1(BlockDriverState *bs, QEMUIOVector *qiov,
-                        int64_t offset, int bytes)
-{
-    uint64_t bs_size = bs->total_sectors * BDRV_SECTOR_SIZE;
-    int n1;
-
-    if ((offset + bytes) <= bs_size) {
-        return bytes;
-    }
-
-    if (offset >= bs_size) {
-        n1 = 0;
-    } else {
-        n1 = bs_size - offset;
-    }
-
-    qemu_iovec_memset(qiov, n1, 0, bytes - n1);
-
-    return n1;
-}
-
 static coroutine_fn int qcow2_co_preadv(BlockDriverState *bs, uint64_t offset,
                                         uint64_t bytes, QEMUIOVector *qiov,
                                         int flags)
 {
     BDRVQcow2State *s = bs->opaque;
-    int offset_in_cluster, n1;
+    int offset_in_cluster;
     int ret;
     unsigned int cur_bytes; /* number of bytes in current iteration */
     uint64_t cluster_offset = 0;
@@ -XXX,XX +XXX,XX @@ static coroutine_fn int qcow2_co_preadv(BlockDriverState *bs, uint64_t offset,
         case QCOW2_CLUSTER_UNALLOCATED:
 
             if (bs->backing) {
-                /* read from the base image */
-                n1 = qcow2_backing_read1(bs->backing->bs, &hd_qiov,
-                                         offset, cur_bytes);
-                if (n1 > 0) {
-                    QEMUIOVector local_qiov;
-
-                    qemu_iovec_init(&local_qiov, hd_qiov.niov);
-                    qemu_iovec_concat(&local_qiov, &hd_qiov, 0, n1);
-
-                    BLKDBG_EVENT(bs->file, BLKDBG_READ_BACKING_AIO);
-                    qemu_co_mutex_unlock(&s->lock);
-                    ret = bdrv_co_preadv(bs->backing, offset, n1,
-                                         &local_qiov, 0);
-                    qemu_co_mutex_lock(&s->lock);
-
-                    qemu_iovec_destroy(&local_qiov);
-
-                    if (ret < 0) {
-                        goto fail;
-                    }
+                BLKDBG_EVENT(bs->file, BLKDBG_READ_BACKING_AIO);
+                qemu_co_mutex_unlock(&s->lock);
+                ret = bdrv_co_preadv(bs->backing, offset, cur_bytes,
+                                     &hd_qiov, 0);
+                qemu_co_mutex_lock(&s->lock);
+                if (ret < 0) {
+                    goto fail;
                 }
             } else {
                 /* Note: in this case, no need to wait */
-- 
2.13.6
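The allocation policy described in the commit message above can be sketched outside QEMU as follows. This is a minimal illustration, not the actual implementation: the 4-cluster refcount-cache minimum mirrors what `MIN_REFCOUNT_CACHE_SIZE` is assumed to be, and each 8-byte L2 entry is taken to map one guest cluster, so full L2 coverage needs `virtual_disk_size / (cluster_size / 8)` bytes.

```python
def read_cache_sizes(combined_cache_size, virtual_disk_size,
                     cluster_size=65536, min_refcount_clusters=4):
    """Split a combined cache budget between the L2 and refcount caches.

    Policy from the patch: give the L2 cache as much as it can actually
    use (full coverage of the virtual disk), and hand everything left
    over to the refcount cache, which gets at least a fixed minimum.
    Returns (l2_cache_size, refcount_cache_size) in bytes.
    """
    # One 8-byte L2 entry covers one guest cluster, so an L2 cache of
    # this size covers the whole virtual disk.
    max_l2_cache = virtual_disk_size // (cluster_size // 8)
    min_refcount_cache = min_refcount_clusters * cluster_size

    if combined_cache_size >= max_l2_cache + min_refcount_cache:
        l2_cache_size = max_l2_cache
        refcount_cache_size = combined_cache_size - l2_cache_size
    else:
        refcount_cache_size = min(combined_cache_size, min_refcount_cache)
        l2_cache_size = combined_cache_size - refcount_cache_size
    return l2_cache_size, refcount_cache_size

# An 8 GiB image with 64 KiB clusters needs 1 MiB of L2 metadata; with a
# 2 MiB combined cache the L2 cache is fully covered and the refcount
# cache gets the remaining 1 MiB.
print(read_cache_sizes(2 * 1024**2, 8 * 1024**3))  # → (1048576, 1048576)
```

With a budget too small to cover the L2 tables, the refcount cache is capped at its minimum and the L2 cache takes whatever is left, matching the `else` branch of the patch.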
From: John Snow <jsnow@redhat.com>

When we've reached the concluded state, we need to expose the error
state if applicable. Add the new field.

This should be sufficient for determining if a job completed
successfully or not after concluding; if we want to discriminate
based on how it failed more mechanically, we can always add an
explicit return code enumeration later.

I didn't bother to make it only show up if we are in the concluded
state; I don't think it's necessary.

Cc: qemu-stable@nongnu.org
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 qapi/block-core.json | 6 +++++-
 blockjob.c           | 2 ++
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index XXXXXXX..XXXXXXX 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -XXX,XX +XXX,XX @@
 # @auto-dismiss: Job will dismiss itself when CONCLUDED, moving to the NULL
 #                state and disappearing from the query list. (since 2.12)
 #
+# @error: Error information if the job did not complete successfully.
+#         Not set if the job completed successfully. (since 2.12.1)
+#
 # Since: 1.1
 ##
 { 'struct': 'BlockJobInfo',
@@ -XXX,XX +XXX,XX @@
           'offset': 'int', 'busy': 'bool', 'paused': 'bool', 'speed': 'int',
           'io-status': 'BlockDeviceIoStatus', 'ready': 'bool',
           'status': 'BlockJobStatus',
-          'auto-finalize': 'bool', 'auto-dismiss': 'bool' } }
+          'auto-finalize': 'bool', 'auto-dismiss': 'bool',
+          '*error': 'str' } }
 
 ##
 # @query-block-jobs:
diff --git a/blockjob.c b/blockjob.c
index XXXXXXX..XXXXXXX 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -XXX,XX +XXX,XX @@ BlockJobInfo *block_job_query(BlockJob *job, Error **errp)
     info->status = job->status;
     info->auto_finalize = job->auto_finalize;
     info->auto_dismiss = job->auto_dismiss;
+    info->has_error = job->ret != 0;
+    info->error = job->ret ? g_strdup(strerror(-job->ret)) : NULL;
     return info;
 }
-- 
2.13.6

Removing a quorum child node with x-blockdev-change results in a quorum
driver state that cannot be recreated with create options because it
would require a list with gaps. This causes trouble in at least
.bdrv_refresh_filename().

Document this problem so that we won't accidentally mark the command
stable without having addressed it.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
---
 qapi/block-core.json | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index XXXXXXX..XXXXXXX 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -XXX,XX +XXX,XX @@
 # does not support all kinds of operations, all kinds of children, nor
 # all block drivers.
 #
+# FIXME Removing children from a quorum node means introducing gaps in the
+# child indices. This cannot be represented in the 'children' list of
+# BlockdevOptionsQuorum, as returned by .bdrv_refresh_filename().
+#
 # Warning: The data in a new quorum child MUST be consistent with that of
 # the rest of the array.
 #
-- 
2.13.6
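Since the new 'error' member is optional, a QMP client should treat its absence on a concluded job as success. A hypothetical client-side sketch (the member names come from the schema above; the lowercase status strings and the surrounding QMP transport are assumptions):

```python
def describe_job(info: dict) -> str:
    """Summarize one BlockJobInfo dict returned by query-block-jobs.

    'error' is only present when the job did not complete successfully,
    so its absence on a concluded job means the job succeeded.
    """
    status = info.get("status")
    if status == "concluded":
        err = info.get("error")
        if err:
            return f"job {info['device']} failed: {err}"
        return f"job {info['device']} completed successfully"
    return f"job {info['device']} is {status}"

print(describe_job({"device": "drive0", "status": "concluded",
                    "error": "No space left on device"}))
# → job drive0 failed: No space left on device
```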
From: Alberto Garcia <berto@igalia.com>

Compressed clusters are not supposed to have the COPIED bit set.
"qemu-img check" detects that and prints an error message reporting
the number of the affected host cluster. This doesn't make much sense
because compressed clusters are not aligned to host clusters, so it
would be better to report the offset instead. Plus, the calculation is
wrong and it uses the raw L2 entry as if it was simply an offset.

This patch fixes the error message and reports the offset of the
compressed cluster.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Message-id: 0f687957feb72e80c740403191a47e607c2463fe.1523376013.git.berto@igalia.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 block/qcow2-refcount.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index XXXXXXX..XXXXXXX 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -XXX,XX +XXX,XX @@ static int check_refcounts_l2(BlockDriverState *bs, BdrvCheckResult *res,
     case QCOW2_CLUSTER_COMPRESSED:
         /* Compressed clusters don't have QCOW_OFLAG_COPIED */
         if (l2_entry & QCOW_OFLAG_COPIED) {
-            fprintf(stderr, "ERROR: cluster %" PRId64 ": "
+            fprintf(stderr, "ERROR: coffset=0x%" PRIx64 ": "
                     "copied flag must never be set for compressed "
-                    "clusters\n", l2_entry >> s->cluster_bits);
+                    "clusters\n", l2_entry & s->cluster_offset_mask);
             l2_entry &= ~QCOW_OFLAG_COPIED;
             res->corruptions++;
         }

From: Doug Gale <doug16k@gmail.com>

Add trace output for commands, errors, and undefined behavior.
Add guest error log output for undefined behavior.
Report invalid undefined accesses to MMIO.
Annotate unlikely error checks with unlikely.

Signed-off-by: Doug Gale <doug16k@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 hw/block/nvme.c       | 349 ++++++++++++++++++++++++++++++++++++++++++--------
 hw/block/trace-events |  93 ++++++++++++++
 2 files changed, 390 insertions(+), 52 deletions(-)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -XXX,XX +XXX,XX @@
 #include "qapi/visitor.h"
 #include "sysemu/block-backend.h"
 
+#include "qemu/log.h"
+#include "trace.h"
 #include "nvme.h"
 
+#define NVME_GUEST_ERR(trace, fmt, ...) \
+    do { \
+        (trace_##trace)(__VA_ARGS__); \
+        qemu_log_mask(LOG_GUEST_ERROR, #trace \
+            " in %s: " fmt "\n", __func__, ## __VA_ARGS__); \
+    } while (0)
+
 static void nvme_process_sq(void *opaque);
 
 static void nvme_addr_read(NvmeCtrl *n, hwaddr addr, void *buf, int size)
@@ -XXX,XX +XXX,XX @@ static void nvme_isr_notify(NvmeCtrl *n, NvmeCQueue *cq)
 {
     if (cq->irq_enabled) {
         if (msix_enabled(&(n->parent_obj))) {
+            trace_nvme_irq_msix(cq->vector);
             msix_notify(&(n->parent_obj), cq->vector);
         } else {
+            trace_nvme_irq_pin();
             pci_irq_pulse(&n->parent_obj);
         }
+    } else {
+        trace_nvme_irq_masked();
     }
 }
 
@@ -XXX,XX +XXX,XX @@ static uint16_t nvme_map_prp(QEMUSGList *qsg, QEMUIOVector *iov, uint64_t prp1,
     trans_len = MIN(len, trans_len);
     int num_prps = (len >> n->page_bits) + 1;
 
-    if (!prp1) {
+    if (unlikely(!prp1)) {
+        trace_nvme_err_invalid_prp();
         return NVME_INVALID_FIELD | NVME_DNR;
     } else if (n->cmbsz && prp1 >= n->ctrl_mem.addr &&
                prp1 < n->ctrl_mem.addr + int128_get64(n->ctrl_mem.size)) {
@@ -XXX,XX +XXX,XX @@ static uint16_t nvme_map_prp(QEMUSGList *qsg, QEMUIOVector *iov, uint64_t prp1,
     }
     len -= trans_len;
     if (len) {
-        if (!prp2) {
+        if (unlikely(!prp2)) {
+            trace_nvme_err_invalid_prp2_missing();
             goto unmap;
         }
         if (len > n->page_size) {
@@ -XXX,XX +XXX,XX @@ static uint16_t nvme_map_prp(QEMUSGList *qsg, QEMUIOVector *iov, uint64_t prp1,
             uint64_t prp_ent = le64_to_cpu(prp_list[i]);
 
             if (i == n->max_prp_ents - 1 && len > n->page_size) {
-                if (!prp_ent || prp_ent & (n->page_size - 1)) {
+                if (unlikely(!prp_ent || prp_ent & (n->page_size - 1))) {
+                    trace_nvme_err_invalid_prplist_ent(prp_ent);
                     goto unmap;
                 }
 
@@ -XXX,XX +XXX,XX @@ static uint16_t nvme_map_prp(QEMUSGList *qsg, QEMUIOVector *iov, uint64_t prp1,
                 prp_ent = le64_to_cpu(prp_list[i]);
             }
 
-            if (!prp_ent || prp_ent & (n->page_size - 1)) {
+            if (unlikely(!prp_ent || prp_ent & (n->page_size - 1))) {
+                trace_nvme_err_invalid_prplist_ent(prp_ent);
                 goto unmap;
             }
 
@@ -XXX,XX +XXX,XX @@ static uint16_t nvme_map_prp(QEMUSGList *qsg, QEMUIOVector *iov, uint64_t prp1,
             i++;
         }
     } else {
-        if (prp2 & (n->page_size - 1)) {
+        if (unlikely(prp2 & (n->page_size - 1))) {
+            trace_nvme_err_invalid_prp2_align(prp2);
             goto unmap;
         }
         if (qsg->nsg) {
@@ -XXX,XX +XXX,XX @@ static uint16_t nvme_dma_read_prp(NvmeCtrl *n, uint8_t *ptr, uint32_t len,
     QEMUIOVector iov;
     uint16_t status = NVME_SUCCESS;
 
+    trace_nvme_dma_read(prp1, prp2);
+
     if (nvme_map_prp(&qsg, &iov, prp1, prp2, len, n)) {
         return NVME_INVALID_FIELD | NVME_DNR;
     }
     if (qsg.nsg > 0) {
-        if (dma_buf_read(ptr, len, &qsg)) {
+        if (unlikely(dma_buf_read(ptr, len, &qsg))) {
+            trace_nvme_err_invalid_dma();
             status = NVME_INVALID_FIELD | NVME_DNR;
         }
         qemu_sglist_destroy(&qsg);
     } else {
-        if (qemu_iovec_to_buf(&iov, 0, ptr, len) != len) {
+        if (unlikely(qemu_iovec_to_buf(&iov, 0, ptr, len) != len)) {
+            trace_nvme_err_invalid_dma();
             status = NVME_INVALID_FIELD | NVME_DNR;
         }
         qemu_iovec_destroy(&iov);
@@ -XXX,XX +XXX,XX @@ static uint16_t nvme_write_zeros(NvmeCtrl *n, NvmeNamespace *ns, NvmeCmd *cmd,
     uint64_t aio_slba = slba << (data_shift - BDRV_SECTOR_BITS);
     uint32_t aio_nlb = nlb << (data_shift - BDRV_SECTOR_BITS);
 
-    if (slba + nlb > ns->id_ns.nsze) {
+    if (unlikely(slba + nlb > ns->id_ns.nsze)) {
+        trace_nvme_err_invalid_lba_range(slba, nlb, ns->id_ns.nsze);
         return NVME_LBA_RANGE | NVME_DNR;
     }
 
@@ -XXX,XX +XXX,XX @@ static uint16_t nvme_rw(NvmeCtrl *n, NvmeNamespace *ns, NvmeCmd *cmd,
     int is_write = rw->opcode == NVME_CMD_WRITE ? 1 : 0;
     enum BlockAcctType acct = is_write ? BLOCK_ACCT_WRITE : BLOCK_ACCT_READ;
 
-    if ((slba + nlb) > ns->id_ns.nsze) {
+    trace_nvme_rw(is_write ? "write" : "read", nlb, data_size, slba);
+
+    if (unlikely((slba + nlb) > ns->id_ns.nsze)) {
         block_acct_invalid(blk_get_stats(n->conf.blk), acct);
+        trace_nvme_err_invalid_lba_range(slba, nlb, ns->id_ns.nsze);
         return NVME_LBA_RANGE | NVME_DNR;
     }
 
@@ -XXX,XX +XXX,XX @@ static uint16_t nvme_io_cmd(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
     NvmeNamespace *ns;
     uint32_t nsid = le32_to_cpu(cmd->nsid);
 
-    if (nsid == 0 || nsid > n->num_namespaces) {
+    if (unlikely(nsid == 0 || nsid > n->num_namespaces)) {
+        trace_nvme_err_invalid_ns(nsid, n->num_namespaces);
         return NVME_INVALID_NSID | NVME_DNR;
     }
 
@@ -XXX,XX +XXX,XX @@ static uint16_t nvme_io_cmd(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
     case NVME_CMD_READ:
         return nvme_rw(n, ns, cmd, req);
     default:
+        trace_nvme_err_invalid_opc(cmd->opcode);
         return NVME_INVALID_OPCODE | NVME_DNR;
     }
 }
@@ -XXX,XX +XXX,XX @@ static uint16_t nvme_del_sq(NvmeCtrl *n, NvmeCmd *cmd)
     NvmeCQueue *cq;
     uint16_t qid = le16_to_cpu(c->qid);
 
-    if (!qid || nvme_check_sqid(n, qid)) {
+    if (unlikely(!qid || nvme_check_sqid(n, qid))) {
+        trace_nvme_err_invalid_del_sq(qid);
         return NVME_INVALID_QID | NVME_DNR;
     }
 
+    trace_nvme_del_sq(qid);
+
     sq = n->sq[qid];
     while (!QTAILQ_EMPTY(&sq->out_req_list)) {
         req = QTAILQ_FIRST(&sq->out_req_list);
@@ -XXX,XX +XXX,XX @@ static uint16_t nvme_create_sq(NvmeCtrl *n, NvmeCmd *cmd)
     uint16_t qflags = le16_to_cpu(c->sq_flags);
     uint64_t prp1 = le64_to_cpu(c->prp1);
 
-    if (!cqid || nvme_check_cqid(n, cqid)) {
+    trace_nvme_create_sq(prp1, sqid, cqid, qsize, qflags);
+
+    if (unlikely(!cqid || nvme_check_cqid(n, cqid))) {
+        trace_nvme_err_invalid_create_sq_cqid(cqid);
         return NVME_INVALID_CQID | NVME_DNR;
     }
-    if (!sqid || !nvme_check_sqid(n, sqid)) {
+    if (unlikely(!sqid || !nvme_check_sqid(n, sqid))) {
+        trace_nvme_err_invalid_create_sq_sqid(sqid);
         return NVME_INVALID_QID | NVME_DNR;
     }
-    if (!qsize || qsize > NVME_CAP_MQES(n->bar.cap)) {
+    if (unlikely(!qsize || qsize > NVME_CAP_MQES(n->bar.cap))) {
+        trace_nvme_err_invalid_create_sq_size(qsize);
         return NVME_MAX_QSIZE_EXCEEDED | NVME_DNR;
     }
-    if (!prp1 || prp1 & (n->page_size - 1)) {
+    if (unlikely(!prp1 || prp1 & (n->page_size - 1))) {
+        trace_nvme_err_invalid_create_sq_addr(prp1);
         return NVME_INVALID_FIELD | NVME_DNR;
     }
-    if (!(NVME_SQ_FLAGS_PC(qflags))) {
+    if (unlikely(!(NVME_SQ_FLAGS_PC(qflags)))) {
+        trace_nvme_err_invalid_create_sq_qflags(NVME_SQ_FLAGS_PC(qflags));
         return NVME_INVALID_FIELD | NVME_DNR;
     }
     sq = g_malloc0(sizeof(*sq));
@@ -XXX,XX +XXX,XX @@ static uint16_t nvme_del_cq(NvmeCtrl *n, NvmeCmd *cmd)
     NvmeCQueue *cq;
     uint16_t qid = le16_to_cpu(c->qid);
 
-    if (!qid || nvme_check_cqid(n, qid)) {
+    if (unlikely(!qid || nvme_check_cqid(n, qid))) {
+        trace_nvme_err_invalid_del_cq_cqid(qid);
         return NVME_INVALID_CQID | NVME_DNR;
     }
 
     cq = n->cq[qid];
-    if (!QTAILQ_EMPTY(&cq->sq_list)) {
+    if (unlikely(!QTAILQ_EMPTY(&cq->sq_list))) {
+        trace_nvme_err_invalid_del_cq_notempty(qid);
         return NVME_INVALID_QUEUE_DEL;
     }
+    trace_nvme_del_cq(qid);
     nvme_free_cq(cq, n);
     return NVME_SUCCESS;
 }
@@ -XXX,XX +XXX,XX @@ static uint16_t nvme_create_cq(NvmeCtrl *n, NvmeCmd *cmd)
     uint16_t qflags = le16_to_cpu(c->cq_flags);
     uint64_t prp1 = le64_to_cpu(c->prp1);
 
-    if (!cqid || !nvme_check_cqid(n, cqid)) {
+    trace_nvme_create_cq(prp1, cqid, vector, qsize, qflags,
+                         NVME_CQ_FLAGS_IEN(qflags) != 0);
+
+    if (unlikely(!cqid || !nvme_check_cqid(n, cqid))) {
+        trace_nvme_err_invalid_create_cq_cqid(cqid);
         return NVME_INVALID_CQID | NVME_DNR;
     }
-    if (!qsize || qsize > NVME_CAP_MQES(n->bar.cap)) {
+    if (unlikely(!qsize || qsize > NVME_CAP_MQES(n->bar.cap))) {
+        trace_nvme_err_invalid_create_cq_size(qsize);
         return NVME_MAX_QSIZE_EXCEEDED | NVME_DNR;
     }
-    if (!prp1) {
+    if (unlikely(!prp1)) {
+        trace_nvme_err_invalid_create_cq_addr(prp1);
         return NVME_INVALID_FIELD | NVME_DNR;
     }
-    if (vector > n->num_queues) {
+    if (unlikely(vector > n->num_queues)) {
+        trace_nvme_err_invalid_create_cq_vector(vector);
         return NVME_INVALID_IRQ_VECTOR | NVME_DNR;
     }
-    if (!(NVME_CQ_FLAGS_PC(qflags))) {
+    if (unlikely(!(NVME_CQ_FLAGS_PC(qflags)))) {
+        trace_nvme_err_invalid_create_cq_qflags(NVME_CQ_FLAGS_PC(qflags));
         return NVME_INVALID_FIELD | NVME_DNR;
     }
 
@@ -XXX,XX +XXX,XX @@ static uint16_t nvme_identify_ctrl(NvmeCtrl *n, NvmeIdentify *c)
     uint64_t prp1 = le64_to_cpu(c->prp1);
     uint64_t prp2 = le64_to_cpu(c->prp2);
 
+    trace_nvme_identify_ctrl();
+
     return nvme_dma_read_prp(n, (uint8_t *)&n->id_ctrl, sizeof(n->id_ctrl),
                              prp1, prp2);
 }
@@ -XXX,XX +XXX,XX @@ static uint16_t nvme_identify_ns(NvmeCtrl *n, NvmeIdentify *c)
     uint64_t prp1 = le64_to_cpu(c->prp1);
     uint64_t prp2 = le64_to_cpu(c->prp2);
 
-    if (nsid == 0 || nsid > n->num_namespaces) {
+    trace_nvme_identify_ns(nsid);
+
+    if (unlikely(nsid == 0 || nsid > n->num_namespaces)) {
+        trace_nvme_err_invalid_ns(nsid, n->num_namespaces);
         return NVME_INVALID_NSID | NVME_DNR;
     }
 
     ns = &n->namespaces[nsid - 1];
+
     return nvme_dma_read_prp(n, (uint8_t *)&ns->id_ns, sizeof(ns->id_ns),
                              prp1, prp2);
 }
@@ -XXX,XX +XXX,XX @@ static uint16_t nvme_identify_nslist(NvmeCtrl *n, NvmeIdentify *c)
     uint16_t ret;
     int i, j = 0;
 
+    trace_nvme_identify_nslist(min_nsid);
+
     list = g_malloc0(data_len);
     for (i = 0; i < n->num_namespaces; i++) {
         if (i < min_nsid) {
@@ -XXX,XX +XXX,XX @@ static uint16_t nvme_identify(NvmeCtrl *n, NvmeCmd *cmd)
     case 0x02:
         return nvme_identify_nslist(n, c);
     default:
+        trace_nvme_err_invalid_identify_cns(le32_to_cpu(c->cns));
         return NVME_INVALID_FIELD | NVME_DNR;
     }
 }
@@ -XXX,XX +XXX,XX @@ static uint16_t nvme_get_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
     switch (dw10) {
     case NVME_VOLATILE_WRITE_CACHE:
         result = blk_enable_write_cache(n->conf.blk);
+        trace_nvme_getfeat_vwcache(result ? "enabled" : "disabled");
         break;
     case NVME_NUMBER_OF_QUEUES:
         result = cpu_to_le32((n->num_queues - 2) | ((n->num_queues - 2) << 16));
+        trace_nvme_getfeat_numq(result);
         break;
     default:
+        trace_nvme_err_invalid_getfeat(dw10);
         return NVME_INVALID_FIELD | NVME_DNR;
     }
 
@@ -XXX,XX +XXX,XX @@ static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
         blk_set_enable_write_cache(n->conf.blk, dw11 & 1);
         break;
     case NVME_NUMBER_OF_QUEUES:
+        trace_nvme_setfeat_numq((dw11 & 0xFFFF) + 1,
+                                ((dw11 >> 16) & 0xFFFF) + 1,
+                                n->num_queues - 1, n->num_queues - 1);
         req->cqe.result =
             cpu_to_le32((n->num_queues - 2) | ((n->num_queues - 2) << 16));
         break;
     default:
+        trace_nvme_err_invalid_setfeat(dw10);
         return NVME_INVALID_FIELD | NVME_DNR;
     }
     return NVME_SUCCESS;
@@ -XXX,XX +XXX,XX @@ static uint16_t nvme_admin_cmd(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
     case NVME_ADM_CMD_GET_FEATURES:
         return nvme_get_feature(n, cmd, req);
     default:
+        trace_nvme_err_invalid_admin_opc(cmd->opcode);
         return NVME_INVALID_OPCODE | NVME_DNR;
     }
 }
@@ -XXX,XX +XXX,XX @@ static int nvme_start_ctrl(NvmeCtrl *n)
     uint32_t page_bits = NVME_CC_MPS(n->bar.cc) + 12;
     uint32_t page_size = 1 << page_bits;
 
-    if (n->cq[0] || n->sq[0] || !n->bar.asq || !n->bar.acq ||
-        n->bar.asq & (page_size - 1) || n->bar.acq & (page_size - 1) ||
-        NVME_CC_MPS(n->bar.cc) < NVME_CAP_MPSMIN(n->bar.cap) ||
-        NVME_CC_MPS(n->bar.cc) > NVME_CAP_MPSMAX(n->bar.cap) ||
-        NVME_CC_IOCQES(n->bar.cc) < NVME_CTRL_CQES_MIN(n->id_ctrl.cqes) ||
-        NVME_CC_IOCQES(n->bar.cc) > NVME_CTRL_CQES_MAX(n->id_ctrl.cqes) ||
-        NVME_CC_IOSQES(n->bar.cc) < NVME_CTRL_SQES_MIN(n->id_ctrl.sqes) ||
-        NVME_CC_IOSQES(n->bar.cc) > NVME_CTRL_SQES_MAX(n->id_ctrl.sqes) ||
-        !NVME_AQA_ASQS(n->bar.aqa) || !NVME_AQA_ACQS(n->bar.aqa)) {
+    if (unlikely(n->cq[0])) {
+        trace_nvme_err_startfail_cq();
+        return -1;
+    }
+    if (unlikely(n->sq[0])) {
+        trace_nvme_err_startfail_sq();
+        return -1;
+    }
+    if (unlikely(!n->bar.asq)) {
+        trace_nvme_err_startfail_nbarasq();
+        return -1;
+    }
+    if (unlikely(!n->bar.acq)) {
+        trace_nvme_err_startfail_nbaracq();
+        return -1;
+    }
+    if (unlikely(n->bar.asq & (page_size - 1))) {
+        trace_nvme_err_startfail_asq_misaligned(n->bar.asq);
+        return -1;
+    }
+    if (unlikely(n->bar.acq & (page_size - 1))) {
+        trace_nvme_err_startfail_acq_misaligned(n->bar.acq);
+        return -1;
+    }
+    if (unlikely(NVME_CC_MPS(n->bar.cc) <
+                 NVME_CAP_MPSMIN(n->bar.cap))) {
+        trace_nvme_err_startfail_page_too_small(
+                    NVME_CC_MPS(n->bar.cc),
+                    NVME_CAP_MPSMIN(n->bar.cap));
+        return -1;
+    }
+    if (unlikely(NVME_CC_MPS(n->bar.cc) >
+                 NVME_CAP_MPSMAX(n->bar.cap))) {
+        trace_nvme_err_startfail_page_too_large(
+                    NVME_CC_MPS(n->bar.cc),
+                    NVME_CAP_MPSMAX(n->bar.cap));
+        return -1;
+    }
+    if (unlikely(NVME_CC_IOCQES(n->bar.cc) <
+                 NVME_CTRL_CQES_MIN(n->id_ctrl.cqes))) {
+        trace_nvme_err_startfail_cqent_too_small(
+                    NVME_CC_IOCQES(n->bar.cc),
+                    NVME_CTRL_CQES_MIN(n->bar.cap));
+        return -1;
+    }
+    if (unlikely(NVME_CC_IOCQES(n->bar.cc) >
+                 NVME_CTRL_CQES_MAX(n->id_ctrl.cqes))) {
+        trace_nvme_err_startfail_cqent_too_large(
+                    NVME_CC_IOCQES(n->bar.cc),
+                    NVME_CTRL_CQES_MAX(n->bar.cap));
+        return -1;
+    }
+    if (unlikely(NVME_CC_IOSQES(n->bar.cc) <
+                 NVME_CTRL_SQES_MIN(n->id_ctrl.sqes))) {
+        trace_nvme_err_startfail_sqent_too_small(
+                    NVME_CC_IOSQES(n->bar.cc),
+                    NVME_CTRL_SQES_MIN(n->bar.cap));
+        return -1;
+    }
+    if (unlikely(NVME_CC_IOSQES(n->bar.cc) >
+                 NVME_CTRL_SQES_MAX(n->id_ctrl.sqes))) {
+        trace_nvme_err_startfail_sqent_too_large(
+                    NVME_CC_IOSQES(n->bar.cc),
+                    NVME_CTRL_SQES_MAX(n->bar.cap));
+        return -1;
+    }
+    if (unlikely(!NVME_AQA_ASQS(n->bar.aqa))) {
+        trace_nvme_err_startfail_asqent_sz_zero();
+        return -1;
+    }
+    if (unlikely(!NVME_AQA_ACQS(n->bar.aqa))) {
+        trace_nvme_err_startfail_acqent_sz_zero();
         return -1;
     }
 
@@ -XXX,XX +XXX,XX @@ static int nvme_start_ctrl(NvmeCtrl *n)
 static void nvme_write_bar(NvmeCtrl *n, hwaddr offset, uint64_t data,
                            unsigned size)
 {
+    if (unlikely(offset & (sizeof(uint32_t) - 1))) {
+        NVME_GUEST_ERR(nvme_ub_mmiowr_misaligned32,
+                       "MMIO write not 32-bit aligned,"
+                       " offset=0x%"PRIx64"", offset);
+        /* should be ignored, fall through for now */
+    }
+
+    if (unlikely(size < sizeof(uint32_t))) {
+        NVME_GUEST_ERR(nvme_ub_mmiowr_toosmall,
+                       "MMIO write smaller than 32-bits,"
+                       " offset=0x%"PRIx64", size=%u",
+                       offset, size);
+        /* should be ignored, fall through for now */
+    }
+
     switch (offset) {
-    case 0xc:
+    case 0xc:   /* INTMS */
+        if (unlikely(msix_enabled(&(n->parent_obj)))) {
+            NVME_GUEST_ERR(nvme_ub_mmiowr_intmask_with_msix,
+                           "undefined access to interrupt mask set"
+                           " when MSI-X is enabled");
+            /* should be ignored, fall through for now */
+        }
         n->bar.intms |= data & 0xffffffff;
         n->bar.intmc = n->bar.intms;
+        trace_nvme_mmio_intm_set(data & 0xffffffff,
+                                 n->bar.intmc);
         break;
-    case 0x10:
+    case 0x10:  /* INTMC */
+        if (unlikely(msix_enabled(&(n->parent_obj)))) {
+            NVME_GUEST_ERR(nvme_ub_mmiowr_intmask_with_msix,
+                           "undefined access to interrupt mask clr"
+                           " when MSI-X is enabled");
+            /* should be ignored, fall through for now */
+        }
         n->bar.intms &= ~(data & 0xffffffff);
         n->bar.intmc = n->bar.intms;
+        trace_nvme_mmio_intm_clr(data & 0xffffffff,
+                                 n->bar.intmc);
         break;
-    case 0x14:
+    case 0x14:  /* CC */
+        trace_nvme_mmio_cfg(data & 0xffffffff);
         /* Windows first sends data, then sends enable bit */
         if (!NVME_CC_EN(data) && !NVME_CC_EN(n->bar.cc) &&
             !NVME_CC_SHN(data) && !NVME_CC_SHN(n->bar.cc))
@@ -XXX,XX +XXX,XX @@ static void nvme_write_bar(NvmeCtrl *n, hwaddr offset, uint64_t data,
 
         if (NVME_CC_EN(data) && !NVME_CC_EN(n->bar.cc)) {
n->bar.cc = data;
493
- if (nvme_start_ctrl(n)) {
494
+ if (unlikely(nvme_start_ctrl(n))) {
495
+ trace_nvme_err_startfail();
496
n->bar.csts = NVME_CSTS_FAILED;
497
} else {
498
+ trace_nvme_mmio_start_success();
499
n->bar.csts = NVME_CSTS_READY;
500
}
501
} else if (!NVME_CC_EN(data) && NVME_CC_EN(n->bar.cc)) {
502
+ trace_nvme_mmio_stopped();
503
nvme_clear_ctrl(n);
504
n->bar.csts &= ~NVME_CSTS_READY;
505
}
506
if (NVME_CC_SHN(data) && !(NVME_CC_SHN(n->bar.cc))) {
507
- nvme_clear_ctrl(n);
508
- n->bar.cc = data;
509
- n->bar.csts |= NVME_CSTS_SHST_COMPLETE;
510
+ trace_nvme_mmio_shutdown_set();
511
+ nvme_clear_ctrl(n);
512
+ n->bar.cc = data;
513
+ n->bar.csts |= NVME_CSTS_SHST_COMPLETE;
514
} else if (!NVME_CC_SHN(data) && NVME_CC_SHN(n->bar.cc)) {
515
- n->bar.csts &= ~NVME_CSTS_SHST_COMPLETE;
516
- n->bar.cc = data;
517
+ trace_nvme_mmio_shutdown_cleared();
518
+ n->bar.csts &= ~NVME_CSTS_SHST_COMPLETE;
519
+ n->bar.cc = data;
520
+ }
521
+ break;
522
+ case 0x1C: /* CSTS */
523
+ if (data & (1 << 4)) {
524
+ NVME_GUEST_ERR(nvme_ub_mmiowr_ssreset_w1c_unsupported,
525
+ "attempted to W1C CSTS.NSSRO"
526
+ " but CAP.NSSRS is zero (not supported)");
527
+ } else if (data != 0) {
528
+ NVME_GUEST_ERR(nvme_ub_mmiowr_ro_csts,
529
+ "attempted to set a read only bit"
530
+ " of controller status");
531
+ }
532
+ break;
533
+ case 0x20: /* NSSR */
534
+ if (data == 0x4E564D65) {
535
+ trace_nvme_ub_mmiowr_ssreset_unsupported();
536
+ } else {
537
+ /* The spec says that writes of other values have no effect */
538
+ return;
539
}
540
break;
541
- case 0x24:
542
+ case 0x24: /* AQA */
543
n->bar.aqa = data & 0xffffffff;
544
+ trace_nvme_mmio_aqattr(data & 0xffffffff);
545
break;
546
- case 0x28:
547
+ case 0x28: /* ASQ */
548
n->bar.asq = data;
549
+ trace_nvme_mmio_asqaddr(data);
550
break;
551
- case 0x2c:
552
+ case 0x2c: /* ASQ hi */
553
n->bar.asq |= data << 32;
554
+ trace_nvme_mmio_asqaddr_hi(data, n->bar.asq);
555
break;
556
- case 0x30:
557
+ case 0x30: /* ACQ */
558
+ trace_nvme_mmio_acqaddr(data);
559
n->bar.acq = data;
560
break;
561
- case 0x34:
562
+ case 0x34: /* ACQ hi */
563
n->bar.acq |= data << 32;
564
+ trace_nvme_mmio_acqaddr_hi(data, n->bar.acq);
565
break;
566
+ case 0x38: /* CMBLOC */
567
+ NVME_GUEST_ERR(nvme_ub_mmiowr_cmbloc_reserved,
568
+ "invalid write to reserved CMBLOC"
569
+ " when CMBSZ is zero, ignored");
570
+ return;
571
+ case 0x3C: /* CMBSZ */
572
+ NVME_GUEST_ERR(nvme_ub_mmiowr_cmbsz_readonly,
573
+ "invalid write to read only CMBSZ, ignored");
574
+ return;
575
default:
576
+ NVME_GUEST_ERR(nvme_ub_mmiowr_invalid,
577
+ "invalid MMIO write,"
578
+ " offset=0x%"PRIx64", data=%"PRIx64"",
579
+ offset, data);
580
break;
581
}
582
}
583
@@ -XXX,XX +XXX,XX @@ static uint64_t nvme_mmio_read(void *opaque, hwaddr addr, unsigned size)
584
uint8_t *ptr = (uint8_t *)&n->bar;
585
uint64_t val = 0;
586
587
+ if (unlikely(addr & (sizeof(uint32_t) - 1))) {
588
+ NVME_GUEST_ERR(nvme_ub_mmiord_misaligned32,
589
+ "MMIO read not 32-bit aligned,"
590
+ " offset=0x%"PRIx64"", addr);
591
+ /* should RAZ, fall through for now */
592
+ } else if (unlikely(size < sizeof(uint32_t))) {
593
+ NVME_GUEST_ERR(nvme_ub_mmiord_toosmall,
594
+ "MMIO read smaller than 32-bits,"
595
+ " offset=0x%"PRIx64"", addr);
596
+ /* should RAZ, fall through for now */
597
+ }
598
+
599
if (addr < sizeof(n->bar)) {
600
memcpy(&val, ptr + addr, size);
601
+ } else {
602
+ NVME_GUEST_ERR(nvme_ub_mmiord_invalid_ofs,
603
+ "MMIO read beyond last register,"
604
+ " offset=0x%"PRIx64", returning 0", addr);
605
}
606
+
607
return val;
608
}
609
610
@@ -XXX,XX +XXX,XX @@ static void nvme_process_db(NvmeCtrl *n, hwaddr addr, int val)
611
{
612
uint32_t qid;
613
614
- if (addr & ((1 << 2) - 1)) {
615
+ if (unlikely(addr & ((1 << 2) - 1))) {
616
+ NVME_GUEST_ERR(nvme_ub_db_wr_misaligned,
617
+ "doorbell write not 32-bit aligned,"
618
+ " offset=0x%"PRIx64", ignoring", addr);
619
return;
620
}
621
622
if (((addr - 0x1000) >> 2) & 1) {
623
+ /* Completion queue doorbell write */
624
+
625
uint16_t new_head = val & 0xffff;
626
int start_sqs;
627
NvmeCQueue *cq;
628
629
qid = (addr - (0x1000 + (1 << 2))) >> 3;
630
- if (nvme_check_cqid(n, qid)) {
631
+ if (unlikely(nvme_check_cqid(n, qid))) {
632
+ NVME_GUEST_ERR(nvme_ub_db_wr_invalid_cq,
633
+ "completion queue doorbell write"
634
+ " for nonexistent queue,"
635
+ " sqid=%"PRIu32", ignoring", qid);
636
return;
637
}
638
639
cq = n->cq[qid];
640
- if (new_head >= cq->size) {
641
+ if (unlikely(new_head >= cq->size)) {
642
+ NVME_GUEST_ERR(nvme_ub_db_wr_invalid_cqhead,
643
+ "completion queue doorbell write value"
644
+ " beyond queue size, sqid=%"PRIu32","
645
+ " new_head=%"PRIu16", ignoring",
646
+ qid, new_head);
647
return;
648
}
649
650
@@ -XXX,XX +XXX,XX @@ static void nvme_process_db(NvmeCtrl *n, hwaddr addr, int val)
651
nvme_isr_notify(n, cq);
652
}
653
} else {
654
+ /* Submission queue doorbell write */
655
+
656
uint16_t new_tail = val & 0xffff;
657
NvmeSQueue *sq;
658
659
qid = (addr - 0x1000) >> 3;
660
- if (nvme_check_sqid(n, qid)) {
661
+ if (unlikely(nvme_check_sqid(n, qid))) {
662
+ NVME_GUEST_ERR(nvme_ub_db_wr_invalid_sq,
663
+ "submission queue doorbell write"
664
+ " for nonexistent queue,"
665
+ " sqid=%"PRIu32", ignoring", qid);
666
return;
667
}
668
669
sq = n->sq[qid];
670
- if (new_tail >= sq->size) {
671
+ if (unlikely(new_tail >= sq->size)) {
672
+ NVME_GUEST_ERR(nvme_ub_db_wr_invalid_sqtail,
673
+ "submission queue doorbell write value"
674
+ " beyond queue size, sqid=%"PRIu32","
675
+ " new_tail=%"PRIu16", ignoring",
676
+ qid, new_tail);
677
return;
678
}
679
680
diff --git a/hw/block/trace-events b/hw/block/trace-events
index XXXXXXX..XXXXXXX 100644
--- a/hw/block/trace-events
+++ b/hw/block/trace-events
@@ -XXX,XX +XXX,XX @@ virtio_blk_submit_multireq(void *vdev, void *mrb, int start, int num_reqs, uint6
 hd_geometry_lchs_guess(void *blk, int cyls, int heads, int secs) "blk %p LCHS %d %d %d"
 hd_geometry_guess(void *blk, uint32_t cyls, uint32_t heads, uint32_t secs, int trans) "blk %p CHS %u %u %u trans %d"

+# hw/block/nvme.c
+# nvme traces for successful events
+nvme_irq_msix(uint32_t vector) "raising MSI-X IRQ vector %u"
+nvme_irq_pin(void) "pulsing IRQ pin"
+nvme_irq_masked(void) "IRQ is masked"
+nvme_dma_read(uint64_t prp1, uint64_t prp2) "DMA read, prp1=0x%"PRIx64" prp2=0x%"PRIx64""
+nvme_rw(char const *verb, uint32_t blk_count, uint64_t byte_count, uint64_t lba) "%s %"PRIu32" blocks (%"PRIu64" bytes) from LBA %"PRIu64""
+nvme_create_sq(uint64_t addr, uint16_t sqid, uint16_t cqid, uint16_t qsize, uint16_t qflags) "create submission queue, addr=0x%"PRIx64", sqid=%"PRIu16", cqid=%"PRIu16", qsize=%"PRIu16", qflags=%"PRIu16""
+nvme_create_cq(uint64_t addr, uint16_t cqid, uint16_t vector, uint16_t size, uint16_t qflags, int ien) "create completion queue, addr=0x%"PRIx64", cqid=%"PRIu16", vector=%"PRIu16", qsize=%"PRIu16", qflags=%"PRIu16", ien=%d"
+nvme_del_sq(uint16_t qid) "deleting submission queue sqid=%"PRIu16""
+nvme_del_cq(uint16_t cqid) "deleted completion queue, sqid=%"PRIu16""
+nvme_identify_ctrl(void) "identify controller"
+nvme_identify_ns(uint16_t ns) "identify namespace, nsid=%"PRIu16""
+nvme_identify_nslist(uint16_t ns) "identify namespace list, nsid=%"PRIu16""
+nvme_getfeat_vwcache(char const* result) "get feature volatile write cache, result=%s"
+nvme_getfeat_numq(int result) "get feature number of queues, result=%d"
+nvme_setfeat_numq(int reqcq, int reqsq, int gotcq, int gotsq) "requested cq_count=%d sq_count=%d, responding with cq_count=%d sq_count=%d"
+nvme_mmio_intm_set(uint64_t data, uint64_t new_mask) "wrote MMIO, interrupt mask set, data=0x%"PRIx64", new_mask=0x%"PRIx64""
+nvme_mmio_intm_clr(uint64_t data, uint64_t new_mask) "wrote MMIO, interrupt mask clr, data=0x%"PRIx64", new_mask=0x%"PRIx64""
+nvme_mmio_cfg(uint64_t data) "wrote MMIO, config controller config=0x%"PRIx64""
+nvme_mmio_aqattr(uint64_t data) "wrote MMIO, admin queue attributes=0x%"PRIx64""
+nvme_mmio_asqaddr(uint64_t data) "wrote MMIO, admin submission queue address=0x%"PRIx64""
+nvme_mmio_acqaddr(uint64_t data) "wrote MMIO, admin completion queue address=0x%"PRIx64""
+nvme_mmio_asqaddr_hi(uint64_t data, uint64_t new_addr) "wrote MMIO, admin submission queue high half=0x%"PRIx64", new_address=0x%"PRIx64""
+nvme_mmio_acqaddr_hi(uint64_t data, uint64_t new_addr) "wrote MMIO, admin completion queue high half=0x%"PRIx64", new_address=0x%"PRIx64""
+nvme_mmio_start_success(void) "setting controller enable bit succeeded"
+nvme_mmio_stopped(void) "cleared controller enable bit"
+nvme_mmio_shutdown_set(void) "shutdown bit set"
+nvme_mmio_shutdown_cleared(void) "shutdown bit cleared"
+
+# nvme traces for error conditions
+nvme_err_invalid_dma(void) "PRP/SGL is too small for transfer size"
+nvme_err_invalid_prplist_ent(uint64_t prplist) "PRP list entry is null or not page aligned: 0x%"PRIx64""
+nvme_err_invalid_prp2_align(uint64_t prp2) "PRP2 is not page aligned: 0x%"PRIx64""
+nvme_err_invalid_prp2_missing(void) "PRP2 is null and more data to be transferred"
+nvme_err_invalid_field(void) "invalid field"
+nvme_err_invalid_prp(void) "invalid PRP"
+nvme_err_invalid_sgl(void) "invalid SGL"
+nvme_err_invalid_ns(uint32_t ns, uint32_t limit) "invalid namespace %u not within 1-%u"
+nvme_err_invalid_opc(uint8_t opc) "invalid opcode 0x%"PRIx8""
+nvme_err_invalid_admin_opc(uint8_t opc) "invalid admin opcode 0x%"PRIx8""
+nvme_err_invalid_lba_range(uint64_t start, uint64_t len, uint64_t limit) "Invalid LBA start=%"PRIu64" len=%"PRIu64" limit=%"PRIu64""
+nvme_err_invalid_del_sq(uint16_t qid) "invalid submission queue deletion, sid=%"PRIu16""
+nvme_err_invalid_create_sq_cqid(uint16_t cqid) "failed creating submission queue, invalid cqid=%"PRIu16""
+nvme_err_invalid_create_sq_sqid(uint16_t sqid) "failed creating submission queue, invalid sqid=%"PRIu16""
+nvme_err_invalid_create_sq_size(uint16_t qsize) "failed creating submission queue, invalid qsize=%"PRIu16""
+nvme_err_invalid_create_sq_addr(uint64_t addr) "failed creating submission queue, addr=0x%"PRIx64""
+nvme_err_invalid_create_sq_qflags(uint16_t qflags) "failed creating submission queue, qflags=%"PRIu16""
+nvme_err_invalid_del_cq_cqid(uint16_t cqid) "failed deleting completion queue, cqid=%"PRIu16""
+nvme_err_invalid_del_cq_notempty(uint16_t cqid) "failed deleting completion queue, it is not empty, cqid=%"PRIu16""
+nvme_err_invalid_create_cq_cqid(uint16_t cqid) "failed creating completion queue, cqid=%"PRIu16""
+nvme_err_invalid_create_cq_size(uint16_t size) "failed creating completion queue, size=%"PRIu16""
+nvme_err_invalid_create_cq_addr(uint64_t addr) "failed creating completion queue, addr=0x%"PRIx64""
+nvme_err_invalid_create_cq_vector(uint16_t vector) "failed creating completion queue, vector=%"PRIu16""
+nvme_err_invalid_create_cq_qflags(uint16_t qflags) "failed creating completion queue, qflags=%"PRIu16""
+nvme_err_invalid_identify_cns(uint16_t cns) "identify, invalid cns=0x%"PRIx16""
+nvme_err_invalid_getfeat(int dw10) "invalid get features, dw10=0x%"PRIx32""
+nvme_err_invalid_setfeat(uint32_t dw10) "invalid set features, dw10=0x%"PRIx32""
+nvme_err_startfail_cq(void) "nvme_start_ctrl failed because there are non-admin completion queues"
+nvme_err_startfail_sq(void) "nvme_start_ctrl failed because there are non-admin submission queues"
+nvme_err_startfail_nbarasq(void) "nvme_start_ctrl failed because the admin submission queue address is null"
+nvme_err_startfail_nbaracq(void) "nvme_start_ctrl failed because the admin completion queue address is null"
+nvme_err_startfail_asq_misaligned(uint64_t addr) "nvme_start_ctrl failed because the admin submission queue address is misaligned: 0x%"PRIx64""
+nvme_err_startfail_acq_misaligned(uint64_t addr) "nvme_start_ctrl failed because the admin completion queue address is misaligned: 0x%"PRIx64""
+nvme_err_startfail_page_too_small(uint8_t log2ps, uint8_t maxlog2ps) "nvme_start_ctrl failed because the page size is too small: log2size=%u, min=%u"
+nvme_err_startfail_page_too_large(uint8_t log2ps, uint8_t maxlog2ps) "nvme_start_ctrl failed because the page size is too large: log2size=%u, max=%u"
+nvme_err_startfail_cqent_too_small(uint8_t log2ps, uint8_t maxlog2ps) "nvme_start_ctrl failed because the completion queue entry size is too small: log2size=%u, min=%u"
+nvme_err_startfail_cqent_too_large(uint8_t log2ps, uint8_t maxlog2ps) "nvme_start_ctrl failed because the completion queue entry size is too large: log2size=%u, max=%u"
+nvme_err_startfail_sqent_too_small(uint8_t log2ps, uint8_t maxlog2ps) "nvme_start_ctrl failed because the submission queue entry size is too small: log2size=%u, min=%u"
+nvme_err_startfail_sqent_too_large(uint8_t log2ps, uint8_t maxlog2ps) "nvme_start_ctrl failed because the submission queue entry size is too large: log2size=%u, max=%u"
+nvme_err_startfail_asqent_sz_zero(void) "nvme_start_ctrl failed because the admin submission queue size is zero"
+nvme_err_startfail_acqent_sz_zero(void) "nvme_start_ctrl failed because the admin completion queue size is zero"
+nvme_err_startfail(void) "setting controller enable bit failed"
+
+# Traces for undefined behavior
+nvme_ub_mmiowr_misaligned32(uint64_t offset) "MMIO write not 32-bit aligned, offset=0x%"PRIx64""
+nvme_ub_mmiowr_toosmall(uint64_t offset, unsigned size) "MMIO write smaller than 32 bits, offset=0x%"PRIx64", size=%u"
+nvme_ub_mmiowr_intmask_with_msix(void) "undefined access to interrupt mask set when MSI-X is enabled"
+nvme_ub_mmiowr_ro_csts(void) "attempted to set a read only bit of controller status"
+nvme_ub_mmiowr_ssreset_w1c_unsupported(void) "attempted to W1C CSTS.NSSRO but CAP.NSSRS is zero (not supported)"
+nvme_ub_mmiowr_ssreset_unsupported(void) "attempted NVM subsystem reset but CAP.NSSRS is zero (not supported)"
+nvme_ub_mmiowr_cmbloc_reserved(void) "invalid write to reserved CMBLOC when CMBSZ is zero, ignored"
+nvme_ub_mmiowr_cmbsz_readonly(void) "invalid write to read only CMBSZ, ignored"
+nvme_ub_mmiowr_invalid(uint64_t offset, uint64_t data) "invalid MMIO write, offset=0x%"PRIx64", data=0x%"PRIx64""
+nvme_ub_mmiord_misaligned32(uint64_t offset) "MMIO read not 32-bit aligned, offset=0x%"PRIx64""
+nvme_ub_mmiord_toosmall(uint64_t offset) "MMIO read smaller than 32-bits, offset=0x%"PRIx64""
+nvme_ub_mmiord_invalid_ofs(uint64_t offset) "MMIO read beyond last register, offset=0x%"PRIx64", returning 0"
+nvme_ub_db_wr_misaligned(uint64_t offset) "doorbell write not 32-bit aligned, offset=0x%"PRIx64", ignoring"
+nvme_ub_db_wr_invalid_cq(uint32_t qid) "completion queue doorbell write for nonexistent queue, cqid=%"PRIu32", ignoring"
+nvme_ub_db_wr_invalid_cqhead(uint32_t qid, uint16_t new_head) "completion queue doorbell write value beyond queue size, cqid=%"PRIu32", new_head=%"PRIu16", ignoring"
+nvme_ub_db_wr_invalid_sq(uint32_t qid) "submission queue doorbell write for nonexistent queue, sqid=%"PRIu32", ignoring"
+nvme_ub_db_wr_invalid_sqtail(uint32_t qid, uint16_t new_tail) "submission queue doorbell write value beyond queue size, sqid=%"PRIu32", new_head=%"PRIu16", ignoring"
+
 # hw/block/xen_disk.c
 xen_disk_alloc(char *name) "%s"
 xen_disk_init(char *name) "%s"
-- 
2.13.6

From: Max Reitz <mreitz@redhat.com>

Currently, qemu-io only uses string-valued blockdev options (as all are
converted directly from QemuOpts) -- with one exception: -U adds the
force-share option as a boolean. This in itself is already a bit
questionable, but a real issue is that it also assumes the value already
existing in the options QDict would be a boolean, which is wrong.

That has the following effect:

$ ./qemu-io -r -U --image-opts \
      driver=file,filename=/dev/null,force-share=off
[1]    15200 segmentation fault (core dumped)  ./qemu-io -r -U
--image-opts driver=file,filename=/dev/null,force-share=off

Since @opts is converted from QemuOpts, the value must be a string, and
we have to compare it as such. Consequently, it makes sense to also set
it as a string instead of a boolean.

Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20180502202051.15493-2-mreitz@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 qemu-io.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/qemu-io.c b/qemu-io.c
index XXXXXXX..XXXXXXX 100644
--- a/qemu-io.c
+++ b/qemu-io.c
@@ -XXX,XX +XXX,XX @@ static int openfile(char *name, int flags, bool writethrough, bool force_share,
             opts = qdict_new();
         }
         if (qdict_haskey(opts, BDRV_OPT_FORCE_SHARE)
-            && !qdict_get_bool(opts, BDRV_OPT_FORCE_SHARE)) {
+            && strcmp(qdict_get_str(opts, BDRV_OPT_FORCE_SHARE), "on")) {
             error_report("-U conflicts with image options");
             qobject_unref(opts);
             return 1;
         }
-        qdict_put_bool(opts, BDRV_OPT_FORCE_SHARE, true);
+        qdict_put_str(opts, BDRV_OPT_FORCE_SHARE, "on");
     }
     qemuio_blk = blk_new_open(name, NULL, opts, flags, &local_err);
     if (!qemuio_blk) {
-- 
2.13.6

From: Fam Zheng <famz@redhat.com>

Management tools create overlays of running guests with qemu-img:

$ qemu-img create -b /image/in/use.qcow2 -f qcow2 /overlay/image.qcow2

but this doesn't work anymore due to image locking:

qemu-img: /overlay/image.qcow2: Failed to get shared "write" lock
Is another process using the image?
Could not open backing image to determine size.
Use the force share option to allow this use case again.

Cc: qemu-stable@nongnu.org
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/block.c b/block.c
index XXXXXXX..XXXXXXX 100644
--- a/block.c
+++ b/block.c
@@ -XXX,XX +XXX,XX @@ void bdrv_img_create(const char *filename, const char *fmt,
         back_flags = flags;
         back_flags &= ~(BDRV_O_RDWR | BDRV_O_SNAPSHOT | BDRV_O_NO_BACKING);

+        backing_options = qdict_new();
         if (backing_fmt) {
-            backing_options = qdict_new();
             qdict_put_str(backing_options, "driver", backing_fmt);
         }
+        qdict_put_bool(backing_options, BDRV_OPT_FORCE_SHARE, true);

         bs = bdrv_open(full_backing, NULL, backing_options, back_flags,
                        &local_err);
-- 
2.13.6

From: Max Reitz <mreitz@redhat.com>

Update the rest of the filter drivers to support
BDRV_REQ_WRITE_UNCHANGED. They already forward write request flags to
their children, so we just have to announce support for it.

This patch does not cover the replication driver because that currently
does not support flags at all, and because it just grabs the WRITE
permission for its children when it can, so we should be fine just
submitting the incoming WRITE_UNCHANGED requests as normal writes.

It also does not cover format drivers for similar reasons. They all use
bdrv_format_default_perms() as their .bdrv_child_perm() implementation
so they just always grab the WRITE permission for their file children
whenever possible. In addition, it often would be difficult to
ascertain whether incoming unchanging writes end up as unchanging writes
in their files. So we just leave them as normal potentially changing
writes.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20180421132929.21610-7-mreitz@redhat.com
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 block/blkdebug.c     |  9 +++++----
 block/blkreplay.c    |  3 +++
 block/blkverify.c    |  3 +++
 block/copy-on-read.c | 10 ++++++----
 block/mirror.c       |  2 ++
 block/raw-format.c   |  9 +++++----
 block/throttle.c     |  6 ++++--
 7 files changed, 28 insertions(+), 14 deletions(-)

diff --git a/block/blkdebug.c b/block/blkdebug.c
index XXXXXXX..XXXXXXX 100644
--- a/block/blkdebug.c
+++ b/block/blkdebug.c
@@ -XXX,XX +XXX,XX @@ static int blkdebug_open(BlockDriverState *bs, QDict *options, int flags,
         goto out;
     }

-    bs->supported_write_flags = BDRV_REQ_FUA &
-        bs->file->bs->supported_write_flags;
-    bs->supported_zero_flags = (BDRV_REQ_FUA | BDRV_REQ_MAY_UNMAP) &
-        bs->file->bs->supported_zero_flags;
+    bs->supported_write_flags = BDRV_REQ_WRITE_UNCHANGED |
+        (BDRV_REQ_FUA & bs->file->bs->supported_write_flags);
+    bs->supported_zero_flags = BDRV_REQ_WRITE_UNCHANGED |
+        ((BDRV_REQ_FUA | BDRV_REQ_MAY_UNMAP) &
+         bs->file->bs->supported_zero_flags);

     /* Set alignment overrides */
diff --git a/block/blkreplay.c b/block/blkreplay.c
index XXXXXXX..XXXXXXX 100755
--- a/block/blkreplay.c
+++ b/block/blkreplay.c
@@ -XXX,XX +XXX,XX @@ static int blkreplay_open(BlockDriverState *bs, QDict *options, int flags,
         goto fail;
     }

+    bs->supported_write_flags = BDRV_REQ_WRITE_UNCHANGED;
+    bs->supported_zero_flags = BDRV_REQ_WRITE_UNCHANGED;
+
     ret = 0;
 fail:
     return ret;
diff --git a/block/blkverify.c b/block/blkverify.c
index XXXXXXX..XXXXXXX 100644
--- a/block/blkverify.c
+++ b/block/blkverify.c
@@ -XXX,XX +XXX,XX @@ static int blkverify_open(BlockDriverState *bs, QDict *options, int flags,
         goto fail;
     }

+    bs->supported_write_flags = BDRV_REQ_WRITE_UNCHANGED;
+    bs->supported_zero_flags = BDRV_REQ_WRITE_UNCHANGED;
+
     ret = 0;
 fail:
     qemu_opts_del(opts);
diff --git a/block/copy-on-read.c b/block/copy-on-read.c
index XXXXXXX..XXXXXXX 100644
--- a/block/copy-on-read.c
+++ b/block/copy-on-read.c
@@ -XXX,XX +XXX,XX @@ static int cor_open(BlockDriverState *bs, QDict *options, int flags,
         return -EINVAL;
     }

-    bs->supported_write_flags = BDRV_REQ_FUA &
-        bs->file->bs->supported_write_flags;
+    bs->supported_write_flags = BDRV_REQ_WRITE_UNCHANGED |
+        (BDRV_REQ_FUA &
+            bs->file->bs->supported_write_flags);

-    bs->supported_zero_flags = (BDRV_REQ_FUA | BDRV_REQ_MAY_UNMAP) &
-        bs->file->bs->supported_zero_flags;
+    bs->supported_zero_flags = BDRV_REQ_WRITE_UNCHANGED |
+        ((BDRV_REQ_FUA | BDRV_REQ_MAY_UNMAP) &
+            bs->file->bs->supported_zero_flags);

     return 0;
 }
diff --git a/block/mirror.c b/block/mirror.c
index XXXXXXX..XXXXXXX 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -XXX,XX +XXX,XX @@ static void mirror_start_job(const char *job_id, BlockDriverState *bs,
         mirror_top_bs->implicit = true;
     }
     mirror_top_bs->total_sectors = bs->total_sectors;
+    mirror_top_bs->supported_write_flags = BDRV_REQ_WRITE_UNCHANGED;
+    mirror_top_bs->supported_zero_flags = BDRV_REQ_WRITE_UNCHANGED;
     bdrv_set_aio_context(mirror_top_bs, bdrv_get_aio_context(bs));

     /* bdrv_append takes ownership of the mirror_top_bs reference, need to keep
diff --git a/block/raw-format.c b/block/raw-format.c
index XXXXXXX..XXXXXXX 100644
--- a/block/raw-format.c
+++ b/block/raw-format.c
@@ -XXX,XX +XXX,XX @@ static int raw_open(BlockDriverState *bs, QDict *options, int flags,
     }

     bs->sg = bs->file->bs->sg;
-    bs->supported_write_flags = BDRV_REQ_FUA &
-        bs->file->bs->supported_write_flags;
-    bs->supported_zero_flags = (BDRV_REQ_FUA | BDRV_REQ_MAY_UNMAP) &
-        bs->file->bs->supported_zero_flags;
+    bs->supported_write_flags = BDRV_REQ_WRITE_UNCHANGED |
+        (BDRV_REQ_FUA & bs->file->bs->supported_write_flags);
+    bs->supported_zero_flags = BDRV_REQ_WRITE_UNCHANGED |
+        ((BDRV_REQ_FUA | BDRV_REQ_MAY_UNMAP) &
+         bs->file->bs->supported_zero_flags);

     if (bs->probed && !bdrv_is_read_only(bs)) {
         fprintf(stderr,
diff --git a/block/throttle.c b/block/throttle.c
index XXXXXXX..XXXXXXX 100644
--- a/block/throttle.c
+++ b/block/throttle.c
@@ -XXX,XX +XXX,XX @@ static int throttle_open(BlockDriverState *bs, QDict *options,
     if (!bs->file) {
         return -EINVAL;
     }
-    bs->supported_write_flags = bs->file->bs->supported_write_flags;
-    bs->supported_zero_flags = bs->file->bs->supported_zero_flags;
+    bs->supported_write_flags = bs->file->bs->supported_write_flags |
+                                BDRV_REQ_WRITE_UNCHANGED;
+    bs->supported_zero_flags = bs->file->bs->supported_zero_flags |
+                               BDRV_REQ_WRITE_UNCHANGED;

     return throttle_configure_tgm(bs, tgm, options, errp);
 }
-- 
2.13.6

From: Thomas Huth <thuth@redhat.com>

It's not working anymore since QEMU v1.3.0 - time to remove it now.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 blockdev.c    | 11 -----------
 qemu-doc.texi |  6 ------
 2 files changed, 17 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index XXXXXXX..XXXXXXX 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -XXX,XX +XXX,XX @@ QemuOptsList qemu_legacy_drive_opts = {
             .type = QEMU_OPT_STRING,
             .help = "chs translation (auto, lba, none)",
         },{
-            .name = "boot",
-            .type = QEMU_OPT_BOOL,
-            .help = "(deprecated, ignored)",
-        },{
             .name = "addr",
             .type = QEMU_OPT_STRING,
             .help = "pci address (virtio only)",
@@ -XXX,XX +XXX,XX @@ DriveInfo *drive_new(QemuOpts *all_opts, BlockInterfaceType block_default_type)
         ret = -EINVAL;
         goto fail;
     }

-    /* Deprecated option boot=[on|off] */
-    if (qemu_opt_get(legacy_opts, "boot") != NULL) {
-        fprintf(stderr, "qemu-kvm: boot=on|off is deprecated and will be "
-                "ignored. Future versions will reject this parameter. Please "
-                "update your scripts.\n");
-    }
-
     /* Other deprecated options */
     if (!qtest_enabled()) {
         for (i = 0; i < ARRAY_SIZE(deprecated); i++) {
diff --git a/qemu-doc.texi b/qemu-doc.texi
index XXXXXXX..XXXXXXX 100644
--- a/qemu-doc.texi
+++ b/qemu-doc.texi
@@ -XXX,XX +XXX,XX @@ deprecated.

 @section System emulator command line arguments

-@subsection -drive boot=on|off (since 1.3.0)
-
-The ``boot=on|off'' option to the ``-drive'' argument is
-ignored. Applications should use the ``bootindex=N'' parameter
-to set an absolute ordering between devices instead.
-
 @subsection -tdf (since 1.3.0)

 The ``-tdf'' argument is ignored. The behaviour implemented
-- 
2.13.6

From: Max Reitz <mreitz@redhat.com>

Add BDRV_REQ_WRITE_UNCHANGED to the list of flags honored during pwrite
and pwrite_zeroes, and also add a note on when you absolutely need to
support it.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20180502140359.18222-1-mreitz@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 include/block/block_int.h | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/include/block/block_int.h b/include/block/block_int.h
index XXXXXXX..XXXXXXX 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -XXX,XX +XXX,XX @@ struct BlockDriverState {
     /* I/O Limits */
     BlockLimits bl;

-    /* Flags honored during pwrite (so far: BDRV_REQ_FUA) */
+    /* Flags honored during pwrite (so far: BDRV_REQ_FUA,
+     * BDRV_REQ_WRITE_UNCHANGED).
+     * If a driver does not support BDRV_REQ_WRITE_UNCHANGED, those
+     * writes will be issued as normal writes without the flag set.
+     * This is important to note for drivers that do not explicitly
+     * request a WRITE permission for their children and instead take
+     * the same permissions as their parent did (this is commonly what
+     * block filters do). Such drivers have to be aware that the
+     * parent may have taken a WRITE_UNCHANGED permission only and is
+     * issuing such requests. Drivers either must make sure that
+     * these requests do not result in plain WRITE accesses (usually
+     * by supporting BDRV_REQ_WRITE_UNCHANGED, and then forwarding
+     * every incoming write request as-is, including potentially that
+     * flag), or they have to explicitly take the WRITE permission for
+     * their children. */
     unsigned int supported_write_flags;
     /* Flags honored during pwrite_zeroes (so far: BDRV_REQ_FUA,
-     * BDRV_REQ_MAY_UNMAP) */
+     * BDRV_REQ_MAY_UNMAP, BDRV_REQ_WRITE_UNCHANGED) */
     unsigned int supported_zero_flags;

     /* the following member gives a name to every node on the bs graph. */
-- 
2.13.6

From: Thomas Huth <thuth@redhat.com>

It's been marked as deprecated since QEMU v2.10.0, and so far nobody
complained that we should keep it, so let's remove this legacy option
now to simplify the code quite a bit.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 vl.c            | 86 ++-------------------------------------------------------
 qemu-doc.texi   |  8 ------
 qemu-options.hx | 19 ++-----------
 3 files changed, 4 insertions(+), 109 deletions(-)

diff --git a/vl.c b/vl.c
index XXXXXXX..XXXXXXX 100644
--- a/vl.c
+++ b/vl.c
@@ -XXX,XX +XXX,XX @@ int main(int argc, char **argv, char **envp)
     const char *boot_order = NULL;
     const char *boot_once = NULL;
     DisplayState *ds;
-    int cyls, heads, secs, translation;
     QemuOpts *opts, *machine_opts;
-    QemuOpts *hda_opts = NULL, *icount_opts = NULL, *accel_opts = NULL;
+    QemuOpts *icount_opts = NULL, *accel_opts = NULL;
     QemuOptsList *olist;
     int optind;
     const char *optarg;
@@ -XXX,XX +XXX,XX @@ int main(int argc, char **argv, char **envp)

     cpu_model = NULL;
     snapshot = 0;
-    cyls = heads = secs = 0;
-    translation = BIOS_ATA_TRANSLATION_AUTO;

     nb_nics = 0;

@@ -XXX,XX +XXX,XX @@ int main(int argc, char **argv, char **envp)
         if (optind >= argc)
             break;
         if (argv[optind][0] != '-') {
-            hda_opts = drive_add(IF_DEFAULT, 0, argv[optind++], HD_OPTS);
+            drive_add(IF_DEFAULT, 0, argv[optind++], HD_OPTS);
         } else {
             const QEMUOption *popt;

@@ -XXX,XX +XXX,XX @@ int main(int argc, char **argv, char **envp)
                 cpu_model = optarg;
                 break;
             case QEMU_OPTION_hda:
-                {
-                    char buf[256];
-                    if (cyls == 0)
-                        snprintf(buf, sizeof(buf), "%s", HD_OPTS);
-                    else
-                        snprintf(buf, sizeof(buf),
-                                 "%s,cyls=%d,heads=%d,secs=%d%s",
-                                 HD_OPTS , cyls, heads, secs,
-                                 translation == BIOS_ATA_TRANSLATION_LBA ?
-                                 ",trans=lba" :
-                                 translation == BIOS_ATA_TRANSLATION_NONE ?
-                                 ",trans=none" : "");
-                    drive_add(IF_DEFAULT, 0, optarg, buf);
-                    break;
-                }
             case QEMU_OPTION_hdb:
             case QEMU_OPTION_hdc:
             case QEMU_OPTION_hdd:
@@ -XXX,XX +XXX,XX @@ int main(int argc, char **argv, char **envp)
             case QEMU_OPTION_snapshot:
                 snapshot = 1;
                 break;
-            case QEMU_OPTION_hdachs:
-                {
-                    const char *p;
-                    p = optarg;
-                    cyls = strtol(p, (char **)&p, 0);
-                    if (cyls < 1 || cyls > 16383)
-                        goto chs_fail;
-                    if (*p != ',')
-                        goto chs_fail;
-                    p++;
-                    heads = strtol(p, (char **)&p, 0);
-                    if (heads < 1 || heads > 16)
-                        goto chs_fail;
-                    if (*p != ',')
-                        goto chs_fail;
-                    p++;
-                    secs = strtol(p, (char **)&p, 0);
-                    if (secs < 1 || secs > 63)
-                        goto chs_fail;
-                    if (*p == ',') {
-                        p++;
-                        if (!strcmp(p, "large")) {
-                            translation = BIOS_ATA_TRANSLATION_LARGE;
99
- } else if (!strcmp(p, "rechs")) {
100
- translation = BIOS_ATA_TRANSLATION_RECHS;
101
- } else if (!strcmp(p, "none")) {
102
- translation = BIOS_ATA_TRANSLATION_NONE;
103
- } else if (!strcmp(p, "lba")) {
104
- translation = BIOS_ATA_TRANSLATION_LBA;
105
- } else if (!strcmp(p, "auto")) {
106
- translation = BIOS_ATA_TRANSLATION_AUTO;
107
- } else {
108
- goto chs_fail;
109
- }
110
- } else if (*p != '\0') {
111
- chs_fail:
112
- error_report("invalid physical CHS format");
113
- exit(1);
114
- }
115
- if (hda_opts != NULL) {
116
- qemu_opt_set_number(hda_opts, "cyls", cyls,
117
- &error_abort);
118
- qemu_opt_set_number(hda_opts, "heads", heads,
119
- &error_abort);
120
- qemu_opt_set_number(hda_opts, "secs", secs,
121
- &error_abort);
122
- if (translation == BIOS_ATA_TRANSLATION_LARGE) {
123
- qemu_opt_set(hda_opts, "trans", "large",
124
- &error_abort);
125
- } else if (translation == BIOS_ATA_TRANSLATION_RECHS) {
126
- qemu_opt_set(hda_opts, "trans", "rechs",
127
- &error_abort);
128
- } else if (translation == BIOS_ATA_TRANSLATION_LBA) {
129
- qemu_opt_set(hda_opts, "trans", "lba",
130
- &error_abort);
131
- } else if (translation == BIOS_ATA_TRANSLATION_NONE) {
132
- qemu_opt_set(hda_opts, "trans", "none",
133
- &error_abort);
134
- }
135
- }
136
- }
137
- error_report("'-hdachs' is deprecated, please use '-device"
138
- " ide-hd,cyls=c,heads=h,secs=s,...' instead");
139
- break;
140
case QEMU_OPTION_numa:
141
opts = qemu_opts_parse_noisily(qemu_find_opts("numa"),
142
optarg, true);
143
diff --git a/qemu-doc.texi b/qemu-doc.texi
144
index XXXXXXX..XXXXXXX 100644
145
--- a/qemu-doc.texi
146
+++ b/qemu-doc.texi
147
@@ -XXX,XX +XXX,XX @@ The ``--net dump'' argument is now replaced with the
148
``-object filter-dump'' argument which works in combination
149
with the modern ``-netdev`` backends instead.
150
151
-@subsection -hdachs (since 2.10.0)
152
-
153
-The ``-hdachs'' argument is now a synonym for setting
154
-the ``cyls'', ``heads'', ``secs'', and ``trans'' properties
155
-on the ``ide-hd'' device using the ``-device'' argument.
156
-The new syntax allows different settings to be provided
157
-per disk.
158
-
159
@subsection -usbdevice (since 2.10.0)
160
161
The ``-usbdevice DEV'' argument is now a synonym for setting
162
diff --git a/qemu-options.hx b/qemu-options.hx
163
index XXXXXXX..XXXXXXX 100644
164
--- a/qemu-options.hx
165
+++ b/qemu-options.hx
166
@@ -XXX,XX +XXX,XX @@ of available connectors of a given interface type.
167
@item media=@var{media}
168
This option defines the type of the media: disk or cdrom.
169
@item cyls=@var{c},heads=@var{h},secs=@var{s}[,trans=@var{t}]
170
-These options have the same definition as they have in @option{-hdachs}.
171
-These parameters are deprecated, use the corresponding parameters
172
+Force disk physical geometry and the optional BIOS translation (trans=none or
173
+lba). These parameters are deprecated, use the corresponding parameters
174
of @code{-device} instead.
175
@item snapshot=@var{snapshot}
176
@var{snapshot} is "on" or "off" and controls snapshot mode for the given drive
177
@@ -XXX,XX +XXX,XX @@ the raw disk image you use is not written back. You can however force
178
the write back by pressing @key{C-a s} (@pxref{disk_images}).
179
ETEXI
180
181
-DEF("hdachs", HAS_ARG, QEMU_OPTION_hdachs, \
182
- "-hdachs c,h,s[,t]\n" \
183
- " force hard disk 0 physical geometry and the optional BIOS\n" \
184
- " translation (t=none or lba) (usually QEMU can guess them)\n",
185
- QEMU_ARCH_ALL)
186
-STEXI
187
-@item -hdachs @var{c},@var{h},@var{s},[,@var{t}]
188
-@findex -hdachs
189
-Force hard disk 0 physical geometry (1 <= @var{c} <= 16383, 1 <=
190
-@var{h} <= 16, 1 <= @var{s} <= 63) and optionally force the BIOS
191
-translation mode (@var{t}=none, lba or auto). Usually QEMU can guess
192
-all those parameters. This option is deprecated, please use
193
-@code{-device ide-hd,cyls=c,heads=h,secs=s,...} instead.
194
-ETEXI
195
-
196
DEF("fsdev", HAS_ARG, QEMU_OPTION_fsdev,
197
"-fsdev fsdriver,id=id[,path=path,][security_model={mapped-xattr|mapped-file|passthrough|none}]\n"
198
" [,writeout=immediate][,readonly][,socket=socket|sock_fd=sock_fd][,fmode=fmode][,dmode=dmode]\n"
46
--
199
--
47
2.13.6
200
2.13.6
48
201
49
202
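For reference, the removed QEMU_OPTION_hda handler above assembled a -drive option string from the legacy geometry via snprintf(). The same formatting logic can be sketched as follows; hd_drive_opts() is an invented name, and HD_OPTS is assumed to expand to "media=disk" as elsewhere in vl.c:

```python
# Sketch of the option string the removed QEMU_OPTION_hda code built.
# hd_drive_opts() is an invented helper; HD_OPTS is assumed to be
# "media=disk" as defined in vl.c.
HD_OPTS = "media=disk"

def hd_drive_opts(cyls=0, heads=0, secs=0, trans=None):
    """Render legacy -hda geometry into -drive option syntax."""
    if cyls == 0:
        # No geometry given: plain HD_OPTS, as in the cyls == 0 branch.
        return HD_OPTS
    opts = "%s,cyls=%d,heads=%d,secs=%d" % (HD_OPTS, cyls, heads, secs)
    # Only the "lba" and "none" translations were rendered explicitly;
    # the other modes fell through without a trans= suffix.
    if trans in ("lba", "none"):
        opts += ",trans=%s" % trans
    return opts

print(hd_drive_opts(16383, 16, 63, "lba"))
```

The modern replacement sets the same properties on the device instead, e.g. -device ide-hd,cyls=c,heads=h,secs=s,..., as the deprecation message in the diff points out.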
diff view generated by jsdifflib
1
From: Max Reitz <mreitz@redhat.com>
1
From: Thomas Huth <thuth@redhat.com>
2
2
3
COR across nodes (that is, you have some filter node between the
3
Looks like we forgot to announce the deprecation of these options in
4
actual COR target and the node that performs the COR) cannot reliably
4
the corresponding chapter of the qemu-doc text, so let's do that now.
5
work together with the permission system when there is no explicit COR
6
node that can request the WRITE_UNCHANGED permission for its child.
7
This is because COR (currently) sneaks its requests by the usual
8
permission checks, so it can work without a WRITE* permission; but if
9
there is a filter node in between, that will re-issue the request, which
10
then passes through the usual check -- and if nobody has requested a
11
WRITE_UNCHANGED permission, that check will fail.
12
5
13
There is no real direct fix apart from hoping that there is someone who
6
Signed-off-by: Thomas Huth <thuth@redhat.com>
14
has requested that permission; in case of just the qemu-io HMP command
7
Reviewed-by: John Snow <jsnow@redhat.com>
15
(and no guest device), however, that is not the case. The real fix
8
Reviewed-by: Markus Armbruster <armbru@redhat.com>
16
is to implement the copy-on-read flag through an implicitly added COR
9
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
17
node. Such a node can request the necessary permissions as shown in
10
---
18
this test.
11
qemu-doc.texi | 15 +++++++++++++++
12
1 file changed, 15 insertions(+)
19
13
20
Signed-off-by: Max Reitz <mreitz@redhat.com>
14
diff --git a/qemu-doc.texi b/qemu-doc.texi
21
Message-id: 20180421132929.21610-10-mreitz@redhat.com
15
index XXXXXXX..XXXXXXX 100644
22
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
16
--- a/qemu-doc.texi
23
Signed-off-by: Max Reitz <mreitz@redhat.com>
17
+++ b/qemu-doc.texi
24
---
18
@@ -XXX,XX +XXX,XX @@ longer be directly supported in QEMU.
25
tests/qemu-iotests/216 | 115 +++++++++++++++++++++++++++++++++++++++++++++
19
The ``-drive if=scsi'' argument is replaced by the
26
tests/qemu-iotests/216.out | 28 +++++++++++
20
``-device BUS-TYPE'' argument combined with ``-drive if=none''.
27
tests/qemu-iotests/group | 1 +
21
28
3 files changed, 144 insertions(+)
22
+@subsection -drive cyls=...,heads=...,secs=...,trans=... (since 2.10.0)
29
create mode 100755 tests/qemu-iotests/216
30
create mode 100644 tests/qemu-iotests/216.out
31
32
diff --git a/tests/qemu-iotests/216 b/tests/qemu-iotests/216
33
new file mode 100755
34
index XXXXXXX..XXXXXXX
35
--- /dev/null
36
+++ b/tests/qemu-iotests/216
37
@@ -XXX,XX +XXX,XX @@
38
+#!/usr/bin/env python
39
+#
40
+# Copy-on-read tests using a COR filter node
41
+#
42
+# Copyright (C) 2018 Red Hat, Inc.
43
+#
44
+# This program is free software; you can redistribute it and/or modify
45
+# it under the terms of the GNU General Public License as published by
46
+# the Free Software Foundation; either version 2 of the License, or
47
+# (at your option) any later version.
48
+#
49
+# This program is distributed in the hope that it will be useful,
50
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
51
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
52
+# GNU General Public License for more details.
53
+#
54
+# You should have received a copy of the GNU General Public License
55
+# along with this program. If not, see <http://www.gnu.org/licenses/>.
56
+#
57
+# Creator/Owner: Max Reitz <mreitz@redhat.com>
58
+
23
+
59
+import iotests
24
+The drive geometry arguments are replaced by the geometry arguments
60
+from iotests import log, qemu_img_pipe, qemu_io, filter_qemu_io
25
+that can be specified with the ``-device'' parameter.
61
+
26
+
62
+# Need backing file support
27
+@subsection -drive serial=... (since 2.10.0)
63
+iotests.verify_image_format(supported_fmts=['qcow2', 'qcow', 'qed', 'vmdk'])
64
+iotests.verify_platform(['linux'])
65
+
28
+
66
+log('')
29
+The drive serial argument is replaced by the serial argument
67
+log('=== Copy-on-read across nodes ===')
30
+that can be specified with the ``-device'' parameter.
68
+log('')
69
+
31
+
70
+# The old copy-on-read mechanism without a filter node cannot request
32
+@subsection -drive addr=... (since 2.10.0)
71
+# WRITE_UNCHANGED permissions for its child. Therefore it just tries
72
+# to sneak its write by the usual permission system and holds its
73
+# fingers crossed. However, that sneaking does not work so well when
74
+# there is a filter node in the way: That will receive the write
75
+# request and re-issue a new one to its child, which this time is a
76
+# proper write request that will make the permission system cough --
77
+# unless there is someone at the top (like a guest device) that has
78
+# requested write permissions.
79
+#
80
+# A COR filter node, however, can request the proper permissions for
81
+# its child and therefore is not hit by this issue.
82
+
33
+
83
+with iotests.FilePath('base.img') as base_img_path, \
34
+The drive addr argument is replaced by the addr argument
84
+ iotests.FilePath('top.img') as top_img_path, \
35
+that can be specified with the ``-device'' parameter.
85
+ iotests.VM() as vm:
86
+
36
+
87
+ log('--- Setting up images ---')
37
@subsection -net dump (since 2.10.0)
88
+ log('')
38
89
+
39
The ``--net dump'' argument is now replaced with the
90
+ qemu_img_pipe('create', '-f', iotests.imgfmt, base_img_path, '64M')
91
+
92
+ log(filter_qemu_io(qemu_io(base_img_path, '-c', 'write -P 1 0M 1M')))
93
+
94
+ qemu_img_pipe('create', '-f', iotests.imgfmt, '-b', base_img_path,
95
+ top_img_path)
96
+
97
+ log(filter_qemu_io(qemu_io(top_img_path, '-c', 'write -P 2 1M 1M')))
98
+
99
+ log('')
100
+ log('--- Doing COR ---')
101
+ log('')
102
+
103
+ # Compare with e.g. the following:
104
+ # vm.add_drive_raw('if=none,node-name=node0,copy-on-read=on,driver=raw,' \
105
+ # 'file.driver=%s,file.file.filename=%s' %
106
+ # (iotests.imgfmt, top_img_path))
107
+ # (Remove the blockdev-add instead.)
108
+ # ((Not tested here because it hits an assertion in the permission
109
+ # system.))
110
+
111
+ vm.launch()
112
+
113
+ log(vm.qmp('blockdev-add',
114
+ node_name='node0',
115
+ driver='copy-on-read',
116
+ file={
117
+ 'driver': 'raw',
118
+ 'file': {
119
+ 'driver': 'copy-on-read',
120
+ 'file': {
121
+ 'driver': 'raw',
122
+ 'file': {
123
+ 'driver': iotests.imgfmt,
124
+ 'file': {
125
+ 'driver': 'file',
126
+ 'filename': top_img_path
127
+ },
128
+ 'backing': {
129
+ 'driver': iotests.imgfmt,
130
+ 'file': {
131
+ 'driver': 'file',
132
+ 'filename': base_img_path
133
+ }
134
+ }
135
+ }
136
+ }
137
+ }
138
+ }))
139
+
140
+ # Trigger COR
141
+ log(vm.qmp('human-monitor-command',
142
+ command_line='qemu-io node0 "read 0 64M"'))
143
+
144
+ vm.shutdown()
145
+
146
+ log('')
147
+ log('--- Checking COR result ---')
148
+ log('')
149
+
150
+ log(filter_qemu_io(qemu_io(base_img_path, '-c', 'discard 0 64M')))
151
+ log(filter_qemu_io(qemu_io(top_img_path, '-c', 'read -P 1 0M 1M')))
152
+ log(filter_qemu_io(qemu_io(top_img_path, '-c', 'read -P 2 1M 1M')))
153
diff --git a/tests/qemu-iotests/216.out b/tests/qemu-iotests/216.out
154
new file mode 100644
155
index XXXXXXX..XXXXXXX
156
--- /dev/null
157
+++ b/tests/qemu-iotests/216.out
158
@@ -XXX,XX +XXX,XX @@
159
+
160
+=== Copy-on-read across nodes ===
161
+
162
+--- Setting up images ---
163
+
164
+wrote 1048576/1048576 bytes at offset 0
165
+1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
166
+
167
+wrote 1048576/1048576 bytes at offset 1048576
168
+1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
169
+
170
+
171
+--- Doing COR ---
172
+
173
+{u'return': {}}
174
+{u'return': u''}
175
+
176
+--- Checking COR result ---
177
+
178
+discard 67108864/67108864 bytes at offset 0
179
+64 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
180
+
181
+read 1048576/1048576 bytes at offset 0
182
+1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
183
+
184
+read 1048576/1048576 bytes at offset 1048576
185
+1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
186
+
187
diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
188
index XXXXXXX..XXXXXXX 100644
189
--- a/tests/qemu-iotests/group
190
+++ b/tests/qemu-iotests/group
191
@@ -XXX,XX +XXX,XX @@
192
213 rw auto quick
193
214 rw auto
194
215 rw auto quick
195
+216 rw auto quick
196
218 rw auto quick
197
--
40
--
198
2.13.6
41
2.13.6
199
42
200
43
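The deeply nested blockdev-add arguments that test 216 writes out by hand can also be generated programmatically. This sketch (cor_graph() is a made-up helper; the filenames are placeholders) reproduces the two stacked copy-on-read/raw layers over the format node and its backing file:

```python
# Builds the nested blockdev-add options from test 216: two
# copy-on-read/raw layers stacked on a format node with a backing file.
# cor_graph() is an invented helper; filenames are placeholders.
def cor_graph(fmt, top, base):
    node = {
        'driver': fmt,
        'file': {'driver': 'file', 'filename': top},
        'backing': {
            'driver': fmt,
            'file': {'driver': 'file', 'filename': base},
        },
    }
    # Wrap twice in raw + copy-on-read, innermost layer first.
    for _ in range(2):
        node = {'driver': 'copy-on-read',
                'file': {'driver': 'raw', 'file': node}}
    return node

graph = cor_graph('qcow2', 'top.img', 'base.img')
print(graph['driver'])
```

Such a dict can be passed to vm.qmp('blockdev-add', node_name='node0', **graph) in the same way the test does with its literal arguments.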
1
From: Eric Blake <eblake@redhat.com>
1
From: Fam Zheng <famz@redhat.com>
2
2
3
We are gradually moving away from sector-based interfaces, towards
3
Signed-off-by: Fam Zheng <famz@redhat.com>
4
byte-based. Now that all drivers with aio callbacks are using the
5
byte-based interfaces, we can remove the sector-based versions.
6
7
Signed-off-by: Eric Blake <eblake@redhat.com>
8
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
4
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
9
---
5
---
10
include/block/block_int.h | 6 ----
6
include/block/block_int.h | 1 -
11
block/io.c | 84 ++++++++++++++++++++---------------------------
7
block/io.c | 18 ------------------
12
2 files changed, 36 insertions(+), 54 deletions(-)
8
2 files changed, 19 deletions(-)
13
9
14
diff --git a/include/block/block_int.h b/include/block/block_int.h
10
diff --git a/include/block/block_int.h b/include/block/block_int.h
15
index XXXXXXX..XXXXXXX 100644
11
index XXXXXXX..XXXXXXX 100644
16
--- a/include/block/block_int.h
12
--- a/include/block/block_int.h
17
+++ b/include/block/block_int.h
13
+++ b/include/block/block_int.h
18
@@ -XXX,XX +XXX,XX @@ struct BlockDriver {
14
@@ -XXX,XX +XXX,XX @@ bool blk_dev_is_tray_open(BlockBackend *blk);
19
void (*bdrv_refresh_filename)(BlockDriverState *bs, QDict *options);
15
bool blk_dev_is_medium_locked(BlockBackend *blk);
20
16
21
/* aio */
17
void bdrv_set_dirty(BlockDriverState *bs, int64_t offset, int64_t bytes);
22
- BlockAIOCB *(*bdrv_aio_readv)(BlockDriverState *bs,
18
-bool bdrv_requests_pending(BlockDriverState *bs);
23
- int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
19
24
- BlockCompletionFunc *cb, void *opaque);
20
void bdrv_clear_dirty_bitmap(BdrvDirtyBitmap *bitmap, HBitmap **out);
25
BlockAIOCB *(*bdrv_aio_preadv)(BlockDriverState *bs,
21
void bdrv_undo_clear_dirty_bitmap(BdrvDirtyBitmap *bitmap, HBitmap *in);
26
uint64_t offset, uint64_t bytes, QEMUIOVector *qiov, int flags,
27
BlockCompletionFunc *cb, void *opaque);
28
- BlockAIOCB *(*bdrv_aio_writev)(BlockDriverState *bs,
29
- int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
30
- BlockCompletionFunc *cb, void *opaque);
31
BlockAIOCB *(*bdrv_aio_pwritev)(BlockDriverState *bs,
32
uint64_t offset, uint64_t bytes, QEMUIOVector *qiov, int flags,
33
BlockCompletionFunc *cb, void *opaque);
34
diff --git a/block/io.c b/block/io.c
22
diff --git a/block/io.c b/block/io.c
35
index XXXXXXX..XXXXXXX 100644
23
index XXXXXXX..XXXXXXX 100644
36
--- a/block/io.c
24
--- a/block/io.c
37
+++ b/block/io.c
25
+++ b/block/io.c
38
@@ -XXX,XX +XXX,XX @@ static int coroutine_fn bdrv_driver_preadv(BlockDriverState *bs,
26
@@ -XXX,XX +XXX,XX @@ void bdrv_disable_copy_on_read(BlockDriverState *bs)
39
return drv->bdrv_co_preadv(bs, offset, bytes, qiov, flags);
27
assert(old >= 1);
40
}
28
}
41
29
42
- /* FIXME - no need to calculate these if .bdrv_aio_preadv exists */
30
-/* Check if any requests are in-flight (including throttled requests) */
43
- sector_num = offset >> BDRV_SECTOR_BITS;
31
-bool bdrv_requests_pending(BlockDriverState *bs)
44
- nb_sectors = bytes >> BDRV_SECTOR_BITS;
32
-{
33
- BdrvChild *child;
45
-
34
-
46
- if (!drv->bdrv_aio_preadv) {
35
- if (atomic_read(&bs->in_flight)) {
47
- assert((offset & (BDRV_SECTOR_SIZE - 1)) == 0);
36
- return true;
48
- assert((bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
49
- assert((bytes >> BDRV_SECTOR_BITS) <= BDRV_REQUEST_MAX_SECTORS);
50
- }
37
- }
51
-
38
-
52
- if (drv->bdrv_co_readv) {
39
- QLIST_FOREACH(child, &bs->children, next) {
53
- return drv->bdrv_co_readv(bs, sector_num, nb_sectors, qiov);
40
- if (bdrv_requests_pending(child->bs)) {
54
- } else {
41
- return true;
55
+ if (drv->bdrv_aio_preadv) {
56
BlockAIOCB *acb;
57
CoroutineIOCompletion co = {
58
.coroutine = qemu_coroutine_self(),
59
};
60
61
- if (drv->bdrv_aio_preadv) {
62
- acb = drv->bdrv_aio_preadv(bs, offset, bytes, qiov, flags,
63
- bdrv_co_io_em_complete, &co);
64
- } else {
65
- acb = drv->bdrv_aio_readv(bs, sector_num, qiov, nb_sectors,
66
- bdrv_co_io_em_complete, &co);
67
- }
42
- }
68
+ acb = drv->bdrv_aio_preadv(bs, offset, bytes, qiov, flags,
69
+ bdrv_co_io_em_complete, &co);
70
if (acb == NULL) {
71
return -EIO;
72
} else {
73
@@ -XXX,XX +XXX,XX @@ static int coroutine_fn bdrv_driver_preadv(BlockDriverState *bs,
74
return co.ret;
75
}
76
}
77
+
78
+ sector_num = offset >> BDRV_SECTOR_BITS;
79
+ nb_sectors = bytes >> BDRV_SECTOR_BITS;
80
+
81
+ assert((offset & (BDRV_SECTOR_SIZE - 1)) == 0);
82
+ assert((bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
83
+ assert((bytes >> BDRV_SECTOR_BITS) <= BDRV_REQUEST_MAX_SECTORS);
84
+ assert(drv->bdrv_co_readv);
85
+
86
+ return drv->bdrv_co_readv(bs, sector_num, nb_sectors, qiov);
87
}
88
89
static int coroutine_fn bdrv_driver_pwritev(BlockDriverState *bs,
90
@@ -XXX,XX +XXX,XX @@ static int coroutine_fn bdrv_driver_pwritev(BlockDriverState *bs,
91
goto emulate_flags;
92
}
93
94
- /* FIXME - no need to calculate these if .bdrv_aio_pwritev exists */
95
- sector_num = offset >> BDRV_SECTOR_BITS;
96
- nb_sectors = bytes >> BDRV_SECTOR_BITS;
97
-
98
- if (!drv->bdrv_aio_pwritev) {
99
- assert((offset & (BDRV_SECTOR_SIZE - 1)) == 0);
100
- assert((bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
101
- assert((bytes >> BDRV_SECTOR_BITS) <= BDRV_REQUEST_MAX_SECTORS);
102
- }
43
- }
103
-
44
-
104
- if (drv->bdrv_co_writev_flags) {
45
- return false;
105
- ret = drv->bdrv_co_writev_flags(bs, sector_num, nb_sectors, qiov,
46
-}
106
- flags & bs->supported_write_flags);
47
-
107
- flags &= ~bs->supported_write_flags;
48
typedef struct {
108
- } else if (drv->bdrv_co_writev) {
49
Coroutine *co;
109
- assert(!bs->supported_write_flags);
50
BlockDriverState *bs;
110
- ret = drv->bdrv_co_writev(bs, sector_num, nb_sectors, qiov);
111
- } else {
112
+ if (drv->bdrv_aio_pwritev) {
113
BlockAIOCB *acb;
114
CoroutineIOCompletion co = {
115
.coroutine = qemu_coroutine_self(),
116
};
117
118
- if (drv->bdrv_aio_pwritev) {
119
- acb = drv->bdrv_aio_pwritev(bs, offset, bytes, qiov,
120
- flags & bs->supported_write_flags,
121
- bdrv_co_io_em_complete, &co);
122
- flags &= ~bs->supported_write_flags;
123
- } else {
124
- assert(!bs->supported_write_flags);
125
- acb = drv->bdrv_aio_writev(bs, sector_num, qiov, nb_sectors,
126
- bdrv_co_io_em_complete, &co);
127
- }
128
+ acb = drv->bdrv_aio_pwritev(bs, offset, bytes, qiov,
129
+ flags & bs->supported_write_flags,
130
+ bdrv_co_io_em_complete, &co);
131
+ flags &= ~bs->supported_write_flags;
132
if (acb == NULL) {
133
ret = -EIO;
134
} else {
135
qemu_coroutine_yield();
136
ret = co.ret;
137
}
138
+ goto emulate_flags;
139
+ }
140
+
141
+ sector_num = offset >> BDRV_SECTOR_BITS;
142
+ nb_sectors = bytes >> BDRV_SECTOR_BITS;
143
+
144
+ assert((offset & (BDRV_SECTOR_SIZE - 1)) == 0);
145
+ assert((bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
146
+ assert((bytes >> BDRV_SECTOR_BITS) <= BDRV_REQUEST_MAX_SECTORS);
147
+
148
+ if (drv->bdrv_co_writev_flags) {
149
+ ret = drv->bdrv_co_writev_flags(bs, sector_num, nb_sectors, qiov,
150
+ flags & bs->supported_write_flags);
151
+ flags &= ~bs->supported_write_flags;
152
+ } else {
153
+ assert(drv->bdrv_co_writev);
154
+ assert(!bs->supported_write_flags);
155
+ ret = drv->bdrv_co_writev(bs, sector_num, nb_sectors, qiov);
156
}
157
158
emulate_flags:
159
--
51
--
160
2.13.6
52
2.13.6
161
53
162
54
1
From: Max Reitz <mreitz@redhat.com>
1
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2
2
Reviewed-by: Fam Zheng <famz@redhat.com>
3
Signed-off-by: Max Reitz <mreitz@redhat.com>
4
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
5
Reviewed-by: Alberto Garcia <berto@igalia.com>
6
Message-id: 20180421132929.21610-5-mreitz@redhat.com
7
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
8
Signed-off-by: Max Reitz <mreitz@redhat.com>
9
---
3
---
10
block/io.c | 6 ++++--
4
block/io.c | 6 ++++++
11
1 file changed, 4 insertions(+), 2 deletions(-)
5
1 file changed, 6 insertions(+)
12
6
13
diff --git a/block/io.c b/block/io.c
7
diff --git a/block/io.c b/block/io.c
14
index XXXXXXX..XXXXXXX 100644
8
index XXXXXXX..XXXXXXX 100644
15
--- a/block/io.c
9
--- a/block/io.c
16
+++ b/block/io.c
10
+++ b/block/io.c
17
@@ -XXX,XX +XXX,XX @@ static int coroutine_fn bdrv_co_do_copy_on_readv(BdrvChild *child,
11
@@ -XXX,XX +XXX,XX @@ void bdrv_drain_all_begin(void)
18
/* FIXME: Should we (perhaps conditionally) be setting
12
BdrvNextIterator it;
19
* BDRV_REQ_MAY_UNMAP, if it will allow for a sparser copy
13
GSList *aio_ctxs = NULL, *ctx;
20
* that still correctly reads as zero? */
14
21
- ret = bdrv_co_do_pwrite_zeroes(bs, cluster_offset, pnum, 0);
15
+ /* BDRV_POLL_WHILE() for a node can only be called from its own I/O thread
22
+ ret = bdrv_co_do_pwrite_zeroes(bs, cluster_offset, pnum,
16
+ * or the main loop AioContext. We potentially use BDRV_POLL_WHILE() on
23
+ BDRV_REQ_WRITE_UNCHANGED);
17
+ * nodes in several different AioContexts, so make sure we're in the main
24
} else {
18
+ * context. */
25
/* This does not change the data on the disk, it is not
19
+ assert(qemu_get_current_aio_context() == qemu_get_aio_context());
26
* necessary to flush even in cache=writethrough mode.
20
+
27
*/
21
block_job_pause_all();
28
ret = bdrv_driver_pwritev(bs, cluster_offset, pnum,
22
29
- &local_qiov, 0);
23
for (bs = bdrv_first(&it); bs; bs = bdrv_next(&it)) {
30
+ &local_qiov,
31
+ BDRV_REQ_WRITE_UNCHANGED);
32
}
33
34
if (ret < 0) {
35
--
24
--
36
2.13.6
25
2.13.6
37
26
38
27
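The BDRV_REQ_WRITE_UNCHANGED flag passed down here ends up in bdrv_driver_pwritev(), which masks request flags against the node's supported_write_flags: the driver callback only sees the flags it advertises, while the rest are handled generically (FUA is emulated with a flush; an unsupported WRITE_UNCHANGED is simply issued as a normal write). A rough illustration, with flag values that are invented rather than QEMU's actual ones:

```python
# Flag splitting as in bdrv_driver_pwritev(): the driver only receives
# flags it supports; the remainder stays set for generic handling.
# These flag values are illustrative, not QEMU's real constants.
BDRV_REQ_FUA = 1 << 0
BDRV_REQ_WRITE_UNCHANGED = 1 << 1

def split_flags(flags, supported):
    passed = flags & supported   # forwarded to the driver callback
    flags &= ~supported          # remaining flags handled generically
    return passed, flags

passed, rest = split_flags(BDRV_REQ_FUA | BDRV_REQ_WRITE_UNCHANGED,
                           BDRV_REQ_FUA)
print(passed, rest)
```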
1
All block job drivers support .set_speed and all of them duplicate the
1
bdrv_drained_begin() doesn't increase bs->quiesce_counter recursively
2
same code to implement it. Move that code to blockjob.c and remove the
2
and also doesn't notify other parent nodes of children, which both means
3
now useless callback.
3
that the child nodes are not actually drained, and bdrv_drained_begin()
4
is providing useful functionality only on a single node.
5
6
To keep things consistent, we also shouldn't call the block driver
7
callbacks recursively.
8
9
A proper recursive drain version that provides an actually working
10
drained section for child nodes will be introduced later.
4
11
5
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
12
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
6
Reviewed-by: Eric Blake <eblake@redhat.com>
13
Reviewed-by: Fam Zheng <famz@redhat.com>
7
Reviewed-by: Max Reitz <mreitz@redhat.com>
8
Reviewed-by: John Snow <jsnow@redhat.com>
9
---
14
---
10
include/block/blockjob.h | 2 ++
15
block/io.c | 16 +++++++++-------
11
include/block/blockjob_int.h | 3 ---
16
1 file changed, 9 insertions(+), 7 deletions(-)
12
block/backup.c | 13 -------------
13
block/commit.c | 14 --------------
14
block/mirror.c | 26 ++++++--------------------
15
block/stream.c | 14 --------------
16
blockjob.c | 12 ++++--------
17
7 files changed, 12 insertions(+), 72 deletions(-)
18
17
19
diff --git a/include/block/blockjob.h b/include/block/blockjob.h
18
diff --git a/block/io.c b/block/io.c
20
index XXXXXXX..XXXXXXX 100644
19
index XXXXXXX..XXXXXXX 100644
21
--- a/include/block/blockjob.h
20
--- a/block/io.c
22
+++ b/include/block/blockjob.h
21
+++ b/block/io.c
23
@@ -XXX,XX +XXX,XX @@
22
@@ -XXX,XX +XXX,XX @@ static void coroutine_fn bdrv_drain_invoke_entry(void *opaque)
24
#include "block/block.h"
25
#include "qemu/ratelimit.h"
26
27
+#define BLOCK_JOB_SLICE_TIME 100000000ULL /* ns */
28
+
29
typedef struct BlockJobDriver BlockJobDriver;
30
typedef struct BlockJobTxn BlockJobTxn;
31
32
diff --git a/include/block/blockjob_int.h b/include/block/blockjob_int.h
33
index XXXXXXX..XXXXXXX 100644
34
--- a/include/block/blockjob_int.h
35
+++ b/include/block/blockjob_int.h
36
@@ -XXX,XX +XXX,XX @@ struct BlockJobDriver {
37
/** String describing the operation, part of query-block-jobs QMP API */
38
BlockJobType job_type;
39
40
- /** Optional callback for job types that support setting a speed limit */
41
- void (*set_speed)(BlockJob *job, int64_t speed, Error **errp);
42
-
43
/** Mandatory: Entrypoint for the Coroutine. */
44
CoroutineEntry *start;
45
46
diff --git a/block/backup.c b/block/backup.c
47
index XXXXXXX..XXXXXXX 100644
48
--- a/block/backup.c
49
+++ b/block/backup.c
50
@@ -XXX,XX +XXX,XX @@
51
#include "qemu/error-report.h"
52
53
#define BACKUP_CLUSTER_SIZE_DEFAULT (1 << 16)
54
-#define SLICE_TIME 100000000ULL /* ns */
55
56
typedef struct BackupBlockJob {
57
BlockJob common;
58
@@ -XXX,XX +XXX,XX @@ static int coroutine_fn backup_before_write_notify(
59
return backup_do_cow(job, req->offset, req->bytes, NULL, true);
60
}
23
}
61
24
62
-static void backup_set_speed(BlockJob *job, int64_t speed, Error **errp)
25
/* Recursively call BlockDriver.bdrv_co_drain_begin/end callbacks */
63
-{
26
-static void bdrv_drain_invoke(BlockDriverState *bs, bool begin)
64
- BackupBlockJob *s = container_of(job, BackupBlockJob, common);
27
+static void bdrv_drain_invoke(BlockDriverState *bs, bool begin, bool recursive)
65
-
66
- if (speed < 0) {
67
- error_setg(errp, QERR_INVALID_PARAMETER, "speed");
68
- return;
69
- }
70
- ratelimit_set_speed(&s->common.limit, speed, SLICE_TIME);
71
-}
72
-
73
static void backup_cleanup_sync_bitmap(BackupBlockJob *job, int ret)
74
{
28
{
75
BdrvDirtyBitmap *bm;
29
BdrvChild *child, *tmp;
76
@@ -XXX,XX +XXX,XX @@ static const BlockJobDriver backup_job_driver = {
30
BdrvCoDrainData data = { .bs = bs, .done = false, .begin = begin};
77
.instance_size = sizeof(BackupBlockJob),
31
@@ -XXX,XX +XXX,XX @@ static void bdrv_drain_invoke(BlockDriverState *bs, bool begin)
78
.job_type = BLOCK_JOB_TYPE_BACKUP,
32
bdrv_coroutine_enter(bs, data.co);
79
.start = backup_run,
33
BDRV_POLL_WHILE(bs, !data.done);
80
- .set_speed = backup_set_speed,
34
81
.commit = backup_commit,
35
- QLIST_FOREACH_SAFE(child, &bs->children, next, tmp) {
82
.abort = backup_abort,
36
- bdrv_drain_invoke(child->bs, begin);
83
.clean = backup_clean,
37
+ if (recursive) {
84
diff --git a/block/commit.c b/block/commit.c
38
+ QLIST_FOREACH_SAFE(child, &bs->children, next, tmp) {
85
index XXXXXXX..XXXXXXX 100644
39
+ bdrv_drain_invoke(child->bs, begin, true);
86
--- a/block/commit.c
40
+ }
87
+++ b/block/commit.c
41
}
88
@@ -XXX,XX +XXX,XX @@ enum {
89
COMMIT_BUFFER_SIZE = 512 * 1024, /* in bytes */
90
};
91
92
-#define SLICE_TIME 100000000ULL /* ns */
93
-
94
typedef struct CommitBlockJob {
95
BlockJob common;
96
BlockDriverState *commit_top_bs;
97
@@ -XXX,XX +XXX,XX @@ out:
98
block_job_defer_to_main_loop(&s->common, commit_complete, data);
99
}
42
}
100
43
101
-static void commit_set_speed(BlockJob *job, int64_t speed, Error **errp)
44
@@ -XXX,XX +XXX,XX @@ void bdrv_drained_begin(BlockDriverState *bs)
102
-{
45
bdrv_parent_drained_begin(bs);
103
- CommitBlockJob *s = container_of(job, CommitBlockJob, common);
46
}
104
-
47
105
- if (speed < 0) {
48
- bdrv_drain_invoke(bs, true);
106
- error_setg(errp, QERR_INVALID_PARAMETER, "speed");
49
+ bdrv_drain_invoke(bs, true, false);
107
- return;
50
bdrv_drain_recurse(bs);
108
- }
109
- ratelimit_set_speed(&s->common.limit, speed, SLICE_TIME);
110
-}
111
-
112
static const BlockJobDriver commit_job_driver = {
113
.instance_size = sizeof(CommitBlockJob),
114
.job_type = BLOCK_JOB_TYPE_COMMIT,
115
- .set_speed = commit_set_speed,
116
.start = commit_run,
117
};
118
119
diff --git a/block/mirror.c b/block/mirror.c
120
index XXXXXXX..XXXXXXX 100644
121
--- a/block/mirror.c
122
+++ b/block/mirror.c
123
@@ -XXX,XX +XXX,XX @@
124
#include "qemu/ratelimit.h"
125
#include "qemu/bitmap.h"
126
127
-#define SLICE_TIME 100000000ULL /* ns */
128
#define MAX_IN_FLIGHT 16
129
#define MAX_IO_BYTES (1 << 20) /* 1 Mb */
130
#define DEFAULT_MIRROR_BUF_SIZE (MAX_IN_FLIGHT * MAX_IO_BYTES)
131
@@ -XXX,XX +XXX,XX @@ static void mirror_throttle(MirrorBlockJob *s)
132
{
133
int64_t now = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
134
135
- if (now - s->last_pause_ns > SLICE_TIME) {
136
+ if (now - s->last_pause_ns > BLOCK_JOB_SLICE_TIME) {
137
s->last_pause_ns = now;
138
block_job_sleep_ns(&s->common, 0);
139
} else {
140
@@ -XXX,XX +XXX,XX @@ static void coroutine_fn mirror_run(void *opaque)
141
142
/* Note that even when no rate limit is applied we need to yield
143
* periodically with no pending I/O so that bdrv_drain_all() returns.
144
- * We do so every SLICE_TIME nanoseconds, or when there is an error,
145
- * or when the source is clean, whichever comes first.
146
- */
147
+ * We do so every BLOCK_JOB_SLICE_TIME nanoseconds, or when there is
148
+ * an error, or when the source is clean, whichever comes first. */
149
delta = qemu_clock_get_ns(QEMU_CLOCK_REALTIME) - s->last_pause_ns;
150
- if (delta < SLICE_TIME &&
151
+ if (delta < BLOCK_JOB_SLICE_TIME &&
152
s->common.iostatus == BLOCK_DEVICE_IO_STATUS_OK) {
153
if (s->in_flight >= MAX_IN_FLIGHT || s->buf_free_count == 0 ||
154
(cnt == 0 && s->in_flight > 0)) {
155
@@ -XXX,XX +XXX,XX @@ static void coroutine_fn mirror_run(void *opaque)
156
ret = 0;
157
158
if (s->synced && !should_complete) {
159
- delay_ns = (s->in_flight == 0 && cnt == 0 ? SLICE_TIME : 0);
160
+ delay_ns = (s->in_flight == 0 &&
161
+ cnt == 0 ? BLOCK_JOB_SLICE_TIME : 0);
162
}
163
trace_mirror_before_sleep(s, cnt, s->synced, delay_ns);
164
block_job_sleep_ns(&s->common, delay_ns);
165
@@ -XXX,XX +XXX,XX @@ immediate_exit:
166
block_job_defer_to_main_loop(&s->common, mirror_exit, data);
167
}
51
}
168
52
169
-static void mirror_set_speed(BlockJob *job, int64_t speed, Error **errp)
53
@@ -XXX,XX +XXX,XX @@ void bdrv_drained_end(BlockDriverState *bs)
170
-{
54
}
171
- MirrorBlockJob *s = container_of(job, MirrorBlockJob, common);
55
172
-
56
/* Re-enable things in child-to-parent order */
173
- if (speed < 0) {
57
- bdrv_drain_invoke(bs, false);
174
- error_setg(errp, QERR_INVALID_PARAMETER, "speed");
58
+ bdrv_drain_invoke(bs, false, false);
175
- return;
59
bdrv_parent_drained_end(bs);
176
- }
60
aio_enable_external(bdrv_get_aio_context(bs));
177
- ratelimit_set_speed(&s->common.limit, speed, SLICE_TIME);
178
-}
179
-
180
static void mirror_complete(BlockJob *job, Error **errp)
181
{
182
MirrorBlockJob *s = container_of(job, MirrorBlockJob, common);
183
@@ -XXX,XX +XXX,XX @@ static void mirror_drain(BlockJob *job)
184
static const BlockJobDriver mirror_job_driver = {
185
.instance_size = sizeof(MirrorBlockJob),
186
.job_type = BLOCK_JOB_TYPE_MIRROR,
187
- .set_speed = mirror_set_speed,
188
.start = mirror_run,
189
.complete = mirror_complete,
190
.pause = mirror_pause,
191
@@ -XXX,XX +XXX,XX @@ static const BlockJobDriver mirror_job_driver = {
192
static const BlockJobDriver commit_active_job_driver = {
193
.instance_size = sizeof(MirrorBlockJob),
194
.job_type = BLOCK_JOB_TYPE_COMMIT,
195
- .set_speed = mirror_set_speed,
196
.start = mirror_run,
197
.complete = mirror_complete,
198
.pause = mirror_pause,
199
diff --git a/block/stream.c b/block/stream.c
200
index XXXXXXX..XXXXXXX 100644
201
--- a/block/stream.c
202
+++ b/block/stream.c
203
@@ -XXX,XX +XXX,XX @@ enum {
204
STREAM_BUFFER_SIZE = 512 * 1024, /* in bytes */
205
};
206
207
-#define SLICE_TIME 100000000ULL /* ns */
208
-
209
typedef struct StreamBlockJob {
210
BlockJob common;
211
BlockDriverState *base;
212
@@ -XXX,XX +XXX,XX @@ out:
213
block_job_defer_to_main_loop(&s->common, stream_complete, data);
214
}
61
}
215
62
@@ -XXX,XX +XXX,XX @@ void bdrv_drain_all_begin(void)
216
-static void stream_set_speed(BlockJob *job, int64_t speed, Error **errp)
63
aio_context_acquire(aio_context);
217
-{
64
aio_disable_external(aio_context);
218
- StreamBlockJob *s = container_of(job, StreamBlockJob, common);
65
bdrv_parent_drained_begin(bs);
219
-
66
- bdrv_drain_invoke(bs, true);
220
- if (speed < 0) {
67
+ bdrv_drain_invoke(bs, true, true);
221
- error_setg(errp, QERR_INVALID_PARAMETER, "speed");
68
aio_context_release(aio_context);
222
- return;
69
223
- }
70
if (!g_slist_find(aio_ctxs, aio_context)) {
224
- ratelimit_set_speed(&s->common.limit, speed, SLICE_TIME);
71
@@ -XXX,XX +XXX,XX @@ void bdrv_drain_all_end(void)
225
-}
72
226
-
73
/* Re-enable things in child-to-parent order */
227
static const BlockJobDriver stream_job_driver = {
74
aio_context_acquire(aio_context);
228
.instance_size = sizeof(StreamBlockJob),
75
- bdrv_drain_invoke(bs, false);
229
.job_type = BLOCK_JOB_TYPE_STREAM,
76
+ bdrv_drain_invoke(bs, false, true);
230
- .set_speed = stream_set_speed,
77
bdrv_parent_drained_end(bs);
231
.start = stream_run,
78
aio_enable_external(aio_context);
232
};
79
aio_context_release(aio_context);
233
234
diff --git a/blockjob.c b/blockjob.c
235
index XXXXXXX..XXXXXXX 100644
236
--- a/blockjob.c
237
+++ b/blockjob.c
238
@@ -XXX,XX +XXX,XX @@ static bool block_job_timer_pending(BlockJob *job)
239
240
void block_job_set_speed(BlockJob *job, int64_t speed, Error **errp)
241
{
242
- Error *local_err = NULL;
243
int64_t old_speed = job->speed;
244
245
- if (!job->driver->set_speed) {
246
- error_setg(errp, QERR_UNSUPPORTED);
247
- return;
248
- }
249
if (block_job_apply_verb(job, BLOCK_JOB_VERB_SET_SPEED, errp)) {
250
return;
251
}
252
- job->driver->set_speed(job, speed, &local_err);
253
- if (local_err) {
254
- error_propagate(errp, local_err);
255
+ if (speed < 0) {
256
+ error_setg(errp, QERR_INVALID_PARAMETER, "speed");
257
return;
258
}
259
260
+ ratelimit_set_speed(&job->limit, speed, BLOCK_JOB_SLICE_TIME);
261
+
262
job->speed = speed;
263
if (speed && speed <= old_speed) {
264
return;
265
--
2.13.6
diff view generated by jsdifflib
1
From: Eric Blake <eblake@redhat.com>
1
The existing test is for bdrv_drain_all_begin/end() only. Generalise the
2
test case so that it can be run for the other variants as well. At the
3
moment this is only bdrv_drain_begin/end(), but in a while, we'll add
4
another one.
2
5
3
We are gradually moving away from sector-based interfaces, towards
6
Also, add a backing file to the test node to test whether the operations
4
byte-based. Make the change for the last few sector-based callbacks
7
work recursively.
5
in the rbd driver.
6
8
7
Note that the driver was already using byte-based calls for
8
performing actual I/O, so this just gets rid of a round trip
9
of scaling; however, as I don't know if RBD is tolerant of
10
non-sector AIO operations, I went with the conservative approach
11
of adding .bdrv_refresh_limits to override the block layer
12
defaults back to the pre-patch value of 512.
13
14
Signed-off-by: Eric Blake <eblake@redhat.com>
15
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
9
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
16
---
10
---
17
block/rbd.c | 40 ++++++++++++++++++++++------------------
11
tests/test-bdrv-drain.c | 69 ++++++++++++++++++++++++++++++++++++++++++++-----
18
1 file changed, 22 insertions(+), 18 deletions(-)
12
1 file changed, 62 insertions(+), 7 deletions(-)
19
13
20
diff --git a/block/rbd.c b/block/rbd.c
14
diff --git a/tests/test-bdrv-drain.c b/tests/test-bdrv-drain.c
21
index XXXXXXX..XXXXXXX 100644
15
index XXXXXXX..XXXXXXX 100644
22
--- a/block/rbd.c
16
--- a/tests/test-bdrv-drain.c
23
+++ b/block/rbd.c
17
+++ b/tests/test-bdrv-drain.c
24
@@ -XXX,XX +XXX,XX @@ done:
18
@@ -XXX,XX +XXX,XX @@ static BlockDriver bdrv_test = {
19
20
.bdrv_co_drain_begin = bdrv_test_co_drain_begin,
21
.bdrv_co_drain_end = bdrv_test_co_drain_end,
22
+
23
+ .bdrv_child_perm = bdrv_format_default_perms,
24
};
25
26
static void aio_ret_cb(void *opaque, int ret)
27
@@ -XXX,XX +XXX,XX @@ static void aio_ret_cb(void *opaque, int ret)
28
*aio_ret = ret;
25
}
29
}
26
30
27
31
-static void test_drv_cb_drain_all(void)
28
+static void qemu_rbd_refresh_limits(BlockDriverState *bs, Error **errp)
32
+enum drain_type {
33
+ BDRV_DRAIN_ALL,
34
+ BDRV_DRAIN,
35
+};
36
+
37
+static void do_drain_begin(enum drain_type drain_type, BlockDriverState *bs)
29
+{
38
+{
30
+ /* XXX Does RBD support AIO on less than 512-byte alignment? */
39
+ switch (drain_type) {
31
+ bs->bl.request_alignment = 512;
40
+ case BDRV_DRAIN_ALL: bdrv_drain_all_begin(); break;
41
+ case BDRV_DRAIN: bdrv_drained_begin(bs); break;
42
+ default: g_assert_not_reached();
43
+ }
32
+}
44
+}
33
+
45
+
46
+static void do_drain_end(enum drain_type drain_type, BlockDriverState *bs)
47
+{
48
+ switch (drain_type) {
49
+ case BDRV_DRAIN_ALL: bdrv_drain_all_end(); break;
50
+ case BDRV_DRAIN: bdrv_drained_end(bs); break;
51
+ default: g_assert_not_reached();
52
+ }
53
+}
34
+
54
+
35
static int qemu_rbd_set_auth(rados_t cluster, const char *secretid,
55
+static void test_drv_cb_common(enum drain_type drain_type, bool recursive)
36
Error **errp)
37
{
56
{
38
@@ -XXX,XX +XXX,XX @@ failed:
57
BlockBackend *blk;
39
return NULL;
58
- BlockDriverState *bs;
59
- BDRVTestState *s;
60
+ BlockDriverState *bs, *backing;
61
+ BDRVTestState *s, *backing_s;
62
BlockAIOCB *acb;
63
int aio_ret;
64
65
@@ -XXX,XX +XXX,XX @@ static void test_drv_cb_drain_all(void)
66
s = bs->opaque;
67
blk_insert_bs(blk, bs, &error_abort);
68
69
+ backing = bdrv_new_open_driver(&bdrv_test, "backing", 0, &error_abort);
70
+ backing_s = backing->opaque;
71
+ bdrv_set_backing_hd(bs, backing, &error_abort);
72
+
73
/* Simple bdrv_drain_all_begin/end pair, check that CBs are called */
74
g_assert_cmpint(s->drain_count, ==, 0);
75
- bdrv_drain_all_begin();
76
+ g_assert_cmpint(backing_s->drain_count, ==, 0);
77
+
78
+ do_drain_begin(drain_type, bs);
79
+
80
g_assert_cmpint(s->drain_count, ==, 1);
81
- bdrv_drain_all_end();
82
+ g_assert_cmpint(backing_s->drain_count, ==, !!recursive);
83
+
84
+ do_drain_end(drain_type, bs);
85
+
86
g_assert_cmpint(s->drain_count, ==, 0);
87
+ g_assert_cmpint(backing_s->drain_count, ==, 0);
88
89
/* Now do the same while a request is pending */
90
aio_ret = -EINPROGRESS;
91
@@ -XXX,XX +XXX,XX @@ static void test_drv_cb_drain_all(void)
92
g_assert_cmpint(aio_ret, ==, -EINPROGRESS);
93
94
g_assert_cmpint(s->drain_count, ==, 0);
95
- bdrv_drain_all_begin();
96
+ g_assert_cmpint(backing_s->drain_count, ==, 0);
97
+
98
+ do_drain_begin(drain_type, bs);
99
+
100
g_assert_cmpint(aio_ret, ==, 0);
101
g_assert_cmpint(s->drain_count, ==, 1);
102
- bdrv_drain_all_end();
103
+ g_assert_cmpint(backing_s->drain_count, ==, !!recursive);
104
+
105
+ do_drain_end(drain_type, bs);
106
+
107
g_assert_cmpint(s->drain_count, ==, 0);
108
+ g_assert_cmpint(backing_s->drain_count, ==, 0);
109
110
+ bdrv_unref(backing);
111
bdrv_unref(bs);
112
blk_unref(blk);
40
}
113
}
41
114
42
-static BlockAIOCB *qemu_rbd_aio_readv(BlockDriverState *bs,
115
+static void test_drv_cb_drain_all(void)
43
- int64_t sector_num,
116
+{
44
- QEMUIOVector *qiov,
117
+ test_drv_cb_common(BDRV_DRAIN_ALL, true);
45
- int nb_sectors,
118
+}
46
- BlockCompletionFunc *cb,
119
+
47
- void *opaque)
120
+static void test_drv_cb_drain(void)
48
+static BlockAIOCB *qemu_rbd_aio_preadv(BlockDriverState *bs,
121
+{
49
+ uint64_t offset, uint64_t bytes,
122
+ test_drv_cb_common(BDRV_DRAIN, false);
50
+ QEMUIOVector *qiov, int flags,
123
+}
51
+ BlockCompletionFunc *cb,
124
+
52
+ void *opaque)
125
int main(int argc, char **argv)
53
{
126
{
54
- return rbd_start_aio(bs, sector_num << BDRV_SECTOR_BITS, qiov,
127
bdrv_init();
55
- (int64_t) nb_sectors << BDRV_SECTOR_BITS, cb, opaque,
128
@@ -XXX,XX +XXX,XX @@ int main(int argc, char **argv)
56
+ return rbd_start_aio(bs, offset, qiov, bytes, cb, opaque,
129
g_test_init(&argc, &argv, NULL);
57
RBD_AIO_READ);
130
131
g_test_add_func("/bdrv-drain/driver-cb/drain_all", test_drv_cb_drain_all);
132
+ g_test_add_func("/bdrv-drain/driver-cb/drain", test_drv_cb_drain);
133
134
return g_test_run();
58
}
135
}
59
60
-static BlockAIOCB *qemu_rbd_aio_writev(BlockDriverState *bs,
61
- int64_t sector_num,
62
- QEMUIOVector *qiov,
63
- int nb_sectors,
64
- BlockCompletionFunc *cb,
65
- void *opaque)
66
+static BlockAIOCB *qemu_rbd_aio_pwritev(BlockDriverState *bs,
67
+ uint64_t offset, uint64_t bytes,
68
+ QEMUIOVector *qiov, int flags,
69
+ BlockCompletionFunc *cb,
70
+ void *opaque)
71
{
72
- return rbd_start_aio(bs, sector_num << BDRV_SECTOR_BITS, qiov,
73
- (int64_t) nb_sectors << BDRV_SECTOR_BITS, cb, opaque,
74
+ return rbd_start_aio(bs, offset, qiov, bytes, cb, opaque,
75
RBD_AIO_WRITE);
76
}
77
78
@@ -XXX,XX +XXX,XX @@ static BlockDriver bdrv_rbd = {
79
.format_name = "rbd",
80
.instance_size = sizeof(BDRVRBDState),
81
.bdrv_parse_filename = qemu_rbd_parse_filename,
82
+ .bdrv_refresh_limits = qemu_rbd_refresh_limits,
83
.bdrv_file_open = qemu_rbd_open,
84
.bdrv_close = qemu_rbd_close,
85
.bdrv_reopen_prepare = qemu_rbd_reopen_prepare,
86
@@ -XXX,XX +XXX,XX @@ static BlockDriver bdrv_rbd = {
87
.bdrv_truncate = qemu_rbd_truncate,
88
.protocol_name = "rbd",
89
90
- .bdrv_aio_readv = qemu_rbd_aio_readv,
91
- .bdrv_aio_writev = qemu_rbd_aio_writev,
92
+ .bdrv_aio_preadv = qemu_rbd_aio_preadv,
93
+ .bdrv_aio_pwritev = qemu_rbd_aio_pwritev,
94
95
#ifdef LIBRBD_SUPPORTS_AIO_FLUSH
96
.bdrv_aio_flush = qemu_rbd_aio_flush,
97
--
2.13.6
1
From: Max Reitz <mreitz@redhat.com>
1
This is currently only working correctly for bdrv_drain(), not for
2
bdrv_drain_all(). Leave a comment for the drain_all case, we'll address
3
it later.
2
4
3
iotest 197 tests copy-on-read using the (now old) copy-on-read flag.
5
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
4
Copy it to 215 and modify it to use the COR filter driver instead.
6
---
7
tests/test-bdrv-drain.c | 45 +++++++++++++++++++++++++++++++++++++++++++++
8
1 file changed, 45 insertions(+)
5
9
6
Signed-off-by: Max Reitz <mreitz@redhat.com>
10
diff --git a/tests/test-bdrv-drain.c b/tests/test-bdrv-drain.c
7
Message-id: 20180421132929.21610-9-mreitz@redhat.com
11
index XXXXXXX..XXXXXXX 100644
8
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
12
--- a/tests/test-bdrv-drain.c
9
Signed-off-by: Max Reitz <mreitz@redhat.com>
13
+++ b/tests/test-bdrv-drain.c
10
---
14
@@ -XXX,XX +XXX,XX @@ static void test_drv_cb_drain(void)
11
tests/qemu-iotests/215 | 120 +++++++++++++++++++++++++++++++++++++++++++++
15
test_drv_cb_common(BDRV_DRAIN, false);
12
tests/qemu-iotests/215.out | 26 ++++++++++
16
}
13
tests/qemu-iotests/group | 1 +
17
14
3 files changed, 147 insertions(+)
18
+static void test_quiesce_common(enum drain_type drain_type, bool recursive)
15
create mode 100755 tests/qemu-iotests/215
19
+{
16
create mode 100644 tests/qemu-iotests/215.out
20
+ BlockBackend *blk;
17
21
+ BlockDriverState *bs, *backing;
18
diff --git a/tests/qemu-iotests/215 b/tests/qemu-iotests/215
19
new file mode 100755
20
index XXXXXXX..XXXXXXX
21
--- /dev/null
22
+++ b/tests/qemu-iotests/215
23
@@ -XXX,XX +XXX,XX @@
24
+#!/bin/bash
25
+#
26
+# Test case for copy-on-read into qcow2, using the COR filter driver
27
+#
28
+# Copyright (C) 2018 Red Hat, Inc.
29
+#
30
+# This program is free software; you can redistribute it and/or modify
31
+# it under the terms of the GNU General Public License as published by
32
+# the Free Software Foundation; either version 2 of the License, or
33
+# (at your option) any later version.
34
+#
35
+# This program is distributed in the hope that it will be useful,
36
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
37
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
38
+# GNU General Public License for more details.
39
+#
40
+# You should have received a copy of the GNU General Public License
41
+# along with this program. If not, see <http://www.gnu.org/licenses/>.
42
+#
43
+
22
+
44
+seq="$(basename $0)"
23
+ blk = blk_new(BLK_PERM_ALL, BLK_PERM_ALL);
45
+echo "QA output created by $seq"
24
+ bs = bdrv_new_open_driver(&bdrv_test, "test-node", BDRV_O_RDWR,
25
+ &error_abort);
26
+ blk_insert_bs(blk, bs, &error_abort);
46
+
27
+
47
+here="$PWD"
28
+ backing = bdrv_new_open_driver(&bdrv_test, "backing", 0, &error_abort);
48
+status=1 # failure is the default!
29
+ bdrv_set_backing_hd(bs, backing, &error_abort);
49
+
30
+
50
+# get standard environment, filters and checks
31
+ g_assert_cmpint(bs->quiesce_counter, ==, 0);
51
+. ./common.rc
32
+ g_assert_cmpint(backing->quiesce_counter, ==, 0);
52
+. ./common.filter
53
+
33
+
54
+TEST_WRAP="$TEST_DIR/t.wrap.qcow2"
34
+ do_drain_begin(drain_type, bs);
55
+BLKDBG_CONF="$TEST_DIR/blkdebug.conf"
56
+
35
+
57
+# Sanity check: our use of blkdebug fails if $TEST_DIR contains spaces
36
+ g_assert_cmpint(bs->quiesce_counter, ==, 1);
58
+# or other problems
37
+ g_assert_cmpint(backing->quiesce_counter, ==, !!recursive);
59
+case "$TEST_DIR" in
60
+ *[^-_a-zA-Z0-9/]*)
61
+ _notrun "Suspicious TEST_DIR='$TEST_DIR', cowardly refusing to run" ;;
62
+esac
63
+
38
+
64
+_cleanup()
39
+ do_drain_end(drain_type, bs);
40
+
41
+ g_assert_cmpint(bs->quiesce_counter, ==, 0);
42
+ g_assert_cmpint(backing->quiesce_counter, ==, 0);
43
+
44
+ bdrv_unref(backing);
45
+ bdrv_unref(bs);
46
+ blk_unref(blk);
47
+}
48
+
49
+static void test_quiesce_drain_all(void)
65
+{
50
+{
66
+ _cleanup_test_img
51
+ // XXX drain_all doesn't quiesce
67
+ rm -f "$TEST_WRAP"
52
+ //test_quiesce_common(BDRV_DRAIN_ALL, true);
68
+ rm -f "$BLKDBG_CONF"
69
+}
53
+}
70
+trap "_cleanup; exit \$status" 0 1 2 3 15
71
+
54
+
72
+# Test is supported for any backing file; but we force qcow2 for our wrapper.
55
+static void test_quiesce_drain(void)
73
+_supported_fmt generic
56
+{
74
+_supported_proto generic
57
+ test_quiesce_common(BDRV_DRAIN, false);
75
+_supported_os Linux
58
+}
76
+# LUKS support may be possible, but it complicates things.
77
+_unsupported_fmt luks
78
+
59
+
79
+echo
60
int main(int argc, char **argv)
80
+echo '=== Copy-on-read ==='
61
{
81
+echo
62
bdrv_init();
63
@@ -XXX,XX +XXX,XX @@ int main(int argc, char **argv)
64
g_test_add_func("/bdrv-drain/driver-cb/drain_all", test_drv_cb_drain_all);
65
g_test_add_func("/bdrv-drain/driver-cb/drain", test_drv_cb_drain);
66
67
+ g_test_add_func("/bdrv-drain/quiesce/drain_all", test_quiesce_drain_all);
68
+ g_test_add_func("/bdrv-drain/quiesce/drain", test_quiesce_drain);
82
+
69
+
83
+# Prep the images
70
return g_test_run();
84
+# VPC rounds image sizes to a specific geometry, force a specific size.
71
}
85
+if [ "$IMGFMT" = "vpc" ]; then
86
+ IMGOPTS=$(_optstr_add "$IMGOPTS" "force_size")
87
+fi
88
+_make_test_img 4G
89
+$QEMU_IO -c "write -P 55 3G 1k" "$TEST_IMG" | _filter_qemu_io
90
+IMGPROTO=file IMGFMT=qcow2 IMGOPTS= TEST_IMG_FILE="$TEST_WRAP" \
91
+ _make_test_img -F "$IMGFMT" -b "$TEST_IMG" | _filter_img_create
92
+$QEMU_IO -f qcow2 -c "write -z -u 1M 64k" "$TEST_WRAP" | _filter_qemu_io
93
+
94
+# Ensure that a read of two clusters, but where one is already allocated,
95
+# does not re-write the allocated cluster
96
+cat > "$BLKDBG_CONF" <<EOF
97
+[inject-error]
98
+event = "cor_write"
99
+sector = "2048"
100
+EOF
101
+$QEMU_IO -c "open \
102
+ -o driver=copy-on-read,file.driver=blkdebug,file.config=$BLKDBG_CONF,file.image.driver=qcow2 $TEST_WRAP" \
103
+ -c "read -P 0 1M 128k" | _filter_qemu_io
104
+
105
+# Read the areas we want copied. A zero-length read should still be a
106
+# no-op. The next read is under 2G, but aligned so that rounding to
107
+# clusters copies more than 2G of zeroes. The final read will pick up
108
+# the non-zero data in the same cluster. Since a 2G read may exhaust
109
+# memory on some machines (particularly 32-bit), we skip the test if
110
+# that fails due to memory pressure.
111
+$QEMU_IO \
112
+ -c "open -o driver=copy-on-read,file.driver=qcow2 $TEST_WRAP" \
113
+ -c "read 0 0" \
114
+ | _filter_qemu_io
115
+output=$($QEMU_IO \
116
+ -c "open -o driver=copy-on-read,file.driver=qcow2 $TEST_WRAP" \
117
+ -c "read -P 0 1k $((2*1024*1024*1024 - 512))" \
118
+ 2>&1 | _filter_qemu_io)
119
+case $output in
120
+ *allocate*)
121
+ _notrun "Insufficient memory to run test" ;;
122
+ *) printf '%s\n' "$output" ;;
123
+esac
124
+$QEMU_IO \
125
+ -c "open -o driver=copy-on-read,file.driver=qcow2 $TEST_WRAP" \
126
+ -c "read -P 0 $((3*1024*1024*1024 + 1024)) 1k" \
127
+ | _filter_qemu_io
128
+
129
+# Copy-on-read is incompatible with read-only
130
+$QEMU_IO \
131
+ -c "open -r -o driver=copy-on-read,file.driver=qcow2 $TEST_WRAP" \
132
+ 2>&1 | _filter_testdir
133
+
134
+# Break the backing chain, and show that images are identical, and that
135
+# we properly copied over explicit zeros.
136
+$QEMU_IMG rebase -u -b "" -f qcow2 "$TEST_WRAP"
137
+$QEMU_IO -f qcow2 -c map "$TEST_WRAP"
138
+_check_test_img
139
+$QEMU_IMG compare -f $IMGFMT -F qcow2 "$TEST_IMG" "$TEST_WRAP"
140
+
141
+# success, all done
142
+echo '*** done'
143
+status=0
144
diff --git a/tests/qemu-iotests/215.out b/tests/qemu-iotests/215.out
145
new file mode 100644
146
index XXXXXXX..XXXXXXX
147
--- /dev/null
148
+++ b/tests/qemu-iotests/215.out
149
@@ -XXX,XX +XXX,XX @@
150
+QA output created by 215
151
+
152
+=== Copy-on-read ===
153
+
154
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=4294967296
155
+wrote 1024/1024 bytes at offset 3221225472
156
+1 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
157
+Formatting 'TEST_DIR/t.wrap.IMGFMT', fmt=IMGFMT size=4294967296 backing_file=TEST_DIR/t.IMGFMT backing_fmt=IMGFMT
158
+wrote 65536/65536 bytes at offset 1048576
159
+64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
160
+read 131072/131072 bytes at offset 1048576
161
+128 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
162
+read 0/0 bytes at offset 0
163
+0 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
164
+read 2147483136/2147483136 bytes at offset 1024
165
+2 GiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
166
+read 1024/1024 bytes at offset 3221226496
167
+1 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
168
+can't open device TEST_DIR/t.wrap.qcow2: Block node is read-only
169
+2 GiB (0x80010000) bytes allocated at offset 0 bytes (0x0)
170
+1023.938 MiB (0x3fff0000) bytes not allocated at offset 2 GiB (0x80010000)
171
+64 KiB (0x10000) bytes allocated at offset 3 GiB (0xc0000000)
172
+1023.938 MiB (0x3fff0000) bytes not allocated at offset 3 GiB (0xc0010000)
173
+No errors were found on the image.
174
+Images are identical.
175
+*** done
176
diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
177
index XXXXXXX..XXXXXXX 100644
178
--- a/tests/qemu-iotests/group
179
+++ b/tests/qemu-iotests/group
180
@@ -XXX,XX +XXX,XX @@
181
212 rw auto quick
182
213 rw auto quick
183
214 rw auto
184
+215 rw auto quick
185
218 rw auto quick
186
--
2.13.6
1
The backup block job directly accesses the driver field in BlockJob. Add
1
Block jobs already paused themselves when their main BlockBackend
2
a wrapper for getting it.
2
entered a drained section. This is not good enough: We also want to
3
pause a block job and may not submit new requests if, for example, the
4
mirror target node should be drained.
5
6
This implements .drained_begin/end callbacks in child_job in order to
7
consider all block nodes related to the job, and removes the
8
BlockBackend callbacks which are unnecessary now because the root of the
9
job's main BlockBackend is always referenced with a child_job, too.
3
10
4
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
11
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
5
Reviewed-by: Eric Blake <eblake@redhat.com>
6
Reviewed-by: Max Reitz <mreitz@redhat.com>
7
Reviewed-by: John Snow <jsnow@redhat.com>
8
---
12
---
9
include/block/blockjob.h | 7 +++++++
13
blockjob.c | 22 +++++++++-------------
10
block/backup.c | 8 +++++---
14
1 file changed, 9 insertions(+), 13 deletions(-)
11
blockjob.c | 5 +++++
12
3 files changed, 17 insertions(+), 3 deletions(-)
13
15
14
diff --git a/include/block/blockjob.h b/include/block/blockjob.h
15
index XXXXXXX..XXXXXXX 100644
16
--- a/include/block/blockjob.h
17
+++ b/include/block/blockjob.h
18
@@ -XXX,XX +XXX,XX @@ void block_job_txn_add_job(BlockJobTxn *txn, BlockJob *job);
19
*/
20
bool block_job_is_internal(BlockJob *job);
21
22
+/**
23
+ * block_job_driver:
24
+ *
25
+ * Returns the driver associated with a block job.
26
+ */
27
+const BlockJobDriver *block_job_driver(BlockJob *job);
28
+
29
#endif
30
diff --git a/block/backup.c b/block/backup.c
31
index XXXXXXX..XXXXXXX 100644
32
--- a/block/backup.c
33
+++ b/block/backup.c
34
@@ -XXX,XX +XXX,XX @@ typedef struct BackupBlockJob {
35
HBitmap *copy_bitmap;
36
} BackupBlockJob;
37
38
+static const BlockJobDriver backup_job_driver;
39
+
40
/* See if in-flight requests overlap and wait for them to complete */
41
static void coroutine_fn wait_for_overlapping_requests(BackupBlockJob *job,
42
int64_t start,
43
@@ -XXX,XX +XXX,XX @@ void backup_do_checkpoint(BlockJob *job, Error **errp)
44
BackupBlockJob *backup_job = container_of(job, BackupBlockJob, common);
45
int64_t len;
46
47
- assert(job->driver->job_type == BLOCK_JOB_TYPE_BACKUP);
48
+ assert(block_job_driver(job) == &backup_job_driver);
49
50
if (backup_job->sync_mode != MIRROR_SYNC_MODE_NONE) {
51
error_setg(errp, "The backup job only supports block checkpoint in"
52
@@ -XXX,XX +XXX,XX @@ void backup_wait_for_overlapping_requests(BlockJob *job, int64_t offset,
53
BackupBlockJob *backup_job = container_of(job, BackupBlockJob, common);
54
int64_t start, end;
55
56
- assert(job->driver->job_type == BLOCK_JOB_TYPE_BACKUP);
57
+ assert(block_job_driver(job) == &backup_job_driver);
58
59
start = QEMU_ALIGN_DOWN(offset, backup_job->cluster_size);
60
end = QEMU_ALIGN_UP(offset + bytes, backup_job->cluster_size);
61
@@ -XXX,XX +XXX,XX @@ void backup_cow_request_begin(CowRequest *req, BlockJob *job,
62
BackupBlockJob *backup_job = container_of(job, BackupBlockJob, common);
63
int64_t start, end;
64
65
- assert(job->driver->job_type == BLOCK_JOB_TYPE_BACKUP);
66
+ assert(block_job_driver(job) == &backup_job_driver);
67
68
start = QEMU_ALIGN_DOWN(offset, backup_job->cluster_size);
69
end = QEMU_ALIGN_UP(offset + bytes, backup_job->cluster_size);
70
diff --git a/blockjob.c b/blockjob.c
16
diff --git a/blockjob.c b/blockjob.c
71
index XXXXXXX..XXXXXXX 100644
17
index XXXXXXX..XXXXXXX 100644
72
--- a/blockjob.c
18
--- a/blockjob.c
73
+++ b/blockjob.c
19
+++ b/blockjob.c
74
@@ -XXX,XX +XXX,XX @@ static bool block_job_started(BlockJob *job)
20
@@ -XXX,XX +XXX,XX @@ static char *child_job_get_parent_desc(BdrvChild *c)
75
return job->co;
21
job->id);
76
}
22
}
77
23
78
+const BlockJobDriver *block_job_driver(BlockJob *job)
24
-static const BdrvChildRole child_job = {
79
+{
25
- .get_parent_desc = child_job_get_parent_desc,
80
+ return job->driver;
26
- .stay_at_node = true,
81
+}
27
-};
82
+
28
-
83
/**
29
-static void block_job_drained_begin(void *opaque)
84
* All jobs must allow a pause point before entering their job proper. This
30
+static void child_job_drained_begin(BdrvChild *c)
85
* ensures that jobs can be paused prior to being started, then resumed later.
31
{
32
- BlockJob *job = opaque;
33
+ BlockJob *job = c->opaque;
34
block_job_pause(job);
35
}
36
37
-static void block_job_drained_end(void *opaque)
38
+static void child_job_drained_end(BdrvChild *c)
39
{
40
- BlockJob *job = opaque;
41
+ BlockJob *job = c->opaque;
42
block_job_resume(job);
43
}
44
45
-static const BlockDevOps block_job_dev_ops = {
46
- .drained_begin = block_job_drained_begin,
47
- .drained_end = block_job_drained_end,
48
+static const BdrvChildRole child_job = {
49
+ .get_parent_desc = child_job_get_parent_desc,
50
+ .drained_begin = child_job_drained_begin,
51
+ .drained_end = child_job_drained_end,
52
+ .stay_at_node = true,
53
};
54
55
void block_job_remove_all_bdrv(BlockJob *job)
56
@@ -XXX,XX +XXX,XX @@ void *block_job_create(const char *job_id, const BlockJobDriver *driver,
57
block_job_add_bdrv(job, "main node", bs, 0, BLK_PERM_ALL, &error_abort);
58
bs->job = job;
59
60
- blk_set_dev_ops(blk, &block_job_dev_ops, job);
61
bdrv_op_unblock(bs, BLOCK_OP_TYPE_DATAPLANE, job->blocker);
62
63
QLIST_INSERT_HEAD(&block_jobs, job, job_list);
86
--
64
--
87
2.13.6
65
2.13.6
88
66
89
67
1
From: Max Reitz <mreitz@redhat.com>
1
Block jobs must be paused if any of the involved nodes are drained.
2
2
3
Some block drivers (iscsi and file-posix when dealing with device files)
3
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
4
do not actually support truncation, even though they provide a
4
---
5
.bdrv_truncate() method and will happily return success when providing a
5
tests/test-bdrv-drain.c | 121 ++++++++++++++++++++++++++++++++++++++++++++++++
6
new size that does not exceed the current size. This is because these
6
1 file changed, 121 insertions(+)
7
drivers expect the user to resize the image outside of qemu and then
8
provide qemu with that information through the block_resize command
9
(compare cb1b83e740384b4e0d950f3d7c81c02b8ce86c2e).
10
7
11
Of course, anyone using qemu-img resize will find that behavior useless.
8
diff --git a/tests/test-bdrv-drain.c b/tests/test-bdrv-drain.c
12
So we should check the actual size of the image after the supposedly
13
successful truncation took place, emit an error if nothing changed and
14
emit a warning if the target size was not met.
15
16
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1523065
17
Signed-off-by: Max Reitz <mreitz@redhat.com>
18
Message-id: 20180421163957.29872-1-mreitz@redhat.com
19
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
20
Signed-off-by: Max Reitz <mreitz@redhat.com>
21
---
22
qemu-img.c | 39 +++++++++++++++++++++++++++++++++++----
23
1 file changed, 35 insertions(+), 4 deletions(-)
24
25
diff --git a/qemu-img.c b/qemu-img.c
26
index XXXXXXX..XXXXXXX 100644
9
index XXXXXXX..XXXXXXX 100644
27
--- a/qemu-img.c
10
--- a/tests/test-bdrv-drain.c
28
+++ b/qemu-img.c
11
+++ b/tests/test-bdrv-drain.c
29
@@ -XXX,XX +XXX,XX @@ static int img_resize(int argc, char **argv)
12
@@ -XXX,XX +XXX,XX @@
30
Error *err = NULL;
13
31
int c, ret, relative;
14
#include "qemu/osdep.h"
32
const char *filename, *fmt, *size;
15
#include "block/block.h"
33
- int64_t n, total_size, current_size;
16
+#include "block/blockjob_int.h"
34
+ int64_t n, total_size, current_size, new_size;
17
#include "sysemu/block-backend.h"
35
bool quiet = false;
18
#include "qapi/error.h"
36
BlockBackend *blk = NULL;
19
37
PreallocMode prealloc = PREALLOC_MODE_OFF;
20
@@ -XXX,XX +XXX,XX @@ static void test_quiesce_drain(void)
38
@@ -XXX,XX +XXX,XX @@ static int img_resize(int argc, char **argv)
21
test_quiesce_common(BDRV_DRAIN, false);
39
}
22
}
40
23
41
ret = blk_truncate(blk, total_size, prealloc, &err);
24
+
42
- if (!ret) {
25
+typedef struct TestBlockJob {
43
- qprintf(quiet, "Image resized.\n");
26
+ BlockJob common;
44
- } else {
27
+ bool should_complete;
45
+ if (ret < 0) {
28
+} TestBlockJob;
46
error_report_err(err);
29
+
47
+ goto out;
30
+static void test_job_completed(BlockJob *job, void *opaque)
31
+{
32
+ block_job_completed(job, 0);
33
+}
34
+
35
+static void coroutine_fn test_job_start(void *opaque)
36
+{
37
+ TestBlockJob *s = opaque;
38
+
39
+ while (!s->should_complete) {
40
+ block_job_sleep_ns(&s->common, 100000);
48
+ }
41
+ }
49
+
42
+
50
+ new_size = blk_getlength(blk);
43
+ block_job_defer_to_main_loop(&s->common, test_job_completed, NULL);
51
+ if (new_size < 0) {
44
+}
52
+ error_report("Failed to verify truncated image length: %s",
53
+ strerror(-new_size));
54
+ ret = -1;
55
+ goto out;
56
}
57
+
45
+
58
+ /* Some block drivers implement a truncation method, but only so
46
+static void test_job_complete(BlockJob *job, Error **errp)
59
+ * the user can cause qemu to refresh the image's size from disk.
47
+{
60
+     * The idea is that the user resizes the image outside of qemu and
+     * then invokes block_resize to inform qemu about it.
+     * (This includes iscsi and file-posix for device files.)
+     * Of course, that is not the behavior someone invoking
+     * qemu-img resize would find useful, so we catch that behavior
+     * here and tell the user. */
+    if (new_size != total_size && new_size == current_size) {
+        error_report("Image was not resized; resizing may not be supported "
+                     "for this image");
+        ret = -1;
+        goto out;
+    }
+
+    if (new_size != total_size) {
+        warn_report("Image should have been resized to %" PRIi64
+                    " bytes, but was resized to %" PRIi64 " bytes",
+                    total_size, new_size);
+    }
+
+    qprintf(quiet, "Image resized.\n");
+
 out:
     blk_unref(blk);
     if (ret) {
--
2.13.6

+    TestBlockJob *s = container_of(job, TestBlockJob, common);
+    s->should_complete = true;
+}
+
+BlockJobDriver test_job_driver = {
+    .instance_size  = sizeof(TestBlockJob),
+    .start          = test_job_start,
+    .complete       = test_job_complete,
+};
+
+static void test_blockjob_common(enum drain_type drain_type)
+{
+    BlockBackend *blk_src, *blk_target;
+    BlockDriverState *src, *target;
+    BlockJob *job;
+    int ret;
+
+    src = bdrv_new_open_driver(&bdrv_test, "source", BDRV_O_RDWR,
+                               &error_abort);
+    blk_src = blk_new(BLK_PERM_ALL, BLK_PERM_ALL);
+    blk_insert_bs(blk_src, src, &error_abort);
+
+    target = bdrv_new_open_driver(&bdrv_test, "target", BDRV_O_RDWR,
+                                  &error_abort);
+    blk_target = blk_new(BLK_PERM_ALL, BLK_PERM_ALL);
+    blk_insert_bs(blk_target, target, &error_abort);
+
+    job = block_job_create("job0", &test_job_driver, src, 0, BLK_PERM_ALL, 0,
+                           0, NULL, NULL, &error_abort);
+    block_job_add_bdrv(job, "target", target, 0, BLK_PERM_ALL, &error_abort);
+    block_job_start(job);
+
+    g_assert_cmpint(job->pause_count, ==, 0);
+    g_assert_false(job->paused);
+    g_assert_false(job->busy); /* We're in block_job_sleep_ns() */
+
+    do_drain_begin(drain_type, src);
+
+    if (drain_type == BDRV_DRAIN_ALL) {
+        /* bdrv_drain_all() drains both src and target, and involves an
+         * additional block_job_pause_all() */
+        g_assert_cmpint(job->pause_count, ==, 3);
+    } else {
+        g_assert_cmpint(job->pause_count, ==, 1);
+    }
+    /* XXX We don't wait until the job is actually paused. Is this okay? */
+    /* g_assert_true(job->paused); */
+    g_assert_false(job->busy); /* The job is paused */
+
+    do_drain_end(drain_type, src);
+
+    g_assert_cmpint(job->pause_count, ==, 0);
+    g_assert_false(job->paused);
+    g_assert_false(job->busy); /* We're in block_job_sleep_ns() */
+
+    do_drain_begin(drain_type, target);
+
+    if (drain_type == BDRV_DRAIN_ALL) {
+        /* bdrv_drain_all() drains both src and target, and involves an
+         * additional block_job_pause_all() */
+        g_assert_cmpint(job->pause_count, ==, 3);
+    } else {
+        g_assert_cmpint(job->pause_count, ==, 1);
+    }
+    /* XXX We don't wait until the job is actually paused. Is this okay? */
+    /* g_assert_true(job->paused); */
+    g_assert_false(job->busy); /* The job is paused */
+
+    do_drain_end(drain_type, target);
+
+    g_assert_cmpint(job->pause_count, ==, 0);
+    g_assert_false(job->paused);
+    g_assert_false(job->busy); /* We're in block_job_sleep_ns() */
+
+    ret = block_job_complete_sync(job, &error_abort);
+    g_assert_cmpint(ret, ==, 0);
+
+    blk_unref(blk_src);
+    blk_unref(blk_target);
+    bdrv_unref(src);
+    bdrv_unref(target);
+}
+
+static void test_blockjob_drain_all(void)
+{
+    test_blockjob_common(BDRV_DRAIN_ALL);
+}
+
+static void test_blockjob_drain(void)
+{
+    test_blockjob_common(BDRV_DRAIN);
+}
+
 int main(int argc, char **argv)
 {
     bdrv_init();
@@ -XXX,XX +XXX,XX @@ int main(int argc, char **argv)
     g_test_add_func("/bdrv-drain/quiesce/drain_all", test_quiesce_drain_all);
     g_test_add_func("/bdrv-drain/quiesce/drain", test_quiesce_drain);
 
+    g_test_add_func("/bdrv-drain/blockjob/drain_all", test_blockjob_drain_all);
+    g_test_add_func("/bdrv-drain/blockjob/drain", test_blockjob_drain);
+
     return g_test_run();
 }
--
2.13.6
diff view generated by jsdifflib
From: Max Reitz <mreitz@redhat.com>

Currently, common.qemu only allows to match for results indicating
success. The only way to fail is by provoking a timeout. However,
sometimes we do have a defined failure output and can match for that,
which saves us from having to wait for the timeout in case of failure.
Because failure can sometimes just result in a _notrun in the test, it
is actually important to care about being able to fail quickly.

Also, sometimes we simply do not get any specific output in case of
success. The only way to handle this currently would be to define an
error message as the string to look for, which means that actual success
results in a timeout. This is really bad because it unnecessarily slows
down a succeeding test.

Therefore, this patch adds a new parameter $success_or_failure to
_timed_wait_for and _send_qemu_cmd. Setting this to a non-empty string
makes both commands expect two match parameters: If the first matches,
the function succeeds. If the second matches, the function fails.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20180406151731.4285-2-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 tests/qemu-iotests/common.qemu | 58 +++++++++++++++++++++++++++++++++++++-----
 1 file changed, 51 insertions(+), 7 deletions(-)

diff --git a/tests/qemu-iotests/common.qemu b/tests/qemu-iotests/common.qemu
index XXXXXXX..XXXXXXX 100644
--- a/tests/qemu-iotests/common.qemu
+++ b/tests/qemu-iotests/common.qemu
@@ -XXX,XX +XXX,XX @@ _in_fd=4
 # response is not echoed out.
 # If $mismatch_only is set, only non-matching responses will
 # be echoed.
+#
+# If $success_or_failure is set, the meaning of the arguments is
+# changed as follows:
+# $2: A string to search for in the response; if found, this indicates
+#     success and ${QEMU_STATUS[$1]} is set to 0.
+# $3: A string to search for in the response; if found, this indicates
+#     failure and the test is either aborted (if $qemu_error_no_exit
+#     is not set) or ${QEMU_STATUS[$1]} is set to -1 (otherwise).
 function _timed_wait_for()
 {
     local h=${1}
     shift
 
+    if [ -z "${success_or_failure}" ]; then
+        success_match=${*}
+        failure_match=
+    else
+        success_match=${1}
+        failure_match=${2}
+    fi
+
+    timeout=yes
+
     QEMU_STATUS[$h]=0
     while IFS= read -t ${QEMU_COMM_TIMEOUT} resp <&${QEMU_OUT[$h]}
     do
@@ -XXX,XX +XXX,XX @@ function _timed_wait_for()
             echo "${resp}" | _filter_testdir | _filter_qemu \
                            | _filter_qemu_io | _filter_qmp | _filter_hmp
         fi
-        grep -q "${*}" < <(echo "${resp}")
+        if [ -n "${failure_match}" ]; then
+            grep -q "${failure_match}" < <(echo "${resp}")
+            if [ $? -eq 0 ]; then
+                timeout=
+                break
+            fi
+        fi
+        grep -q "${success_match}" < <(echo "${resp}")
         if [ $? -eq 0 ]; then
             return
-        elif [ -z "${silent}" ] && [ -n "${mismatch_only}" ]; then
+        fi
+        if [ -z "${silent}" ] && [ -n "${mismatch_only}" ]; then
             echo "${resp}" | _filter_testdir | _filter_qemu \
                            | _filter_qemu_io | _filter_qmp | _filter_hmp
         fi
@@ -XXX,XX +XXX,XX @@ function _timed_wait_for()
     done
     QEMU_STATUS[$h]=-1
     if [ -z "${qemu_error_no_exit}" ]; then
-        echo "Timeout waiting for ${*} on handle ${h}"
-        exit 1 # Timeout means the test failed
+        if [ -n "${timeout}" ]; then
+            echo "Timeout waiting for ${success_match} on handle ${h}"
+        else
+            echo "Wrong response matching ${failure_match} on handle ${h}"
+        fi
+        exit 1 # Timeout or wrong match mean the test failed
     fi
 }
 
@@ -XXX,XX +XXX,XX @@ function _timed_wait_for()
 # If $qemu_error_no_exit is set, then even if the expected response
 # is not seen, we will not exit. $QEMU_STATUS[$1] will be set it -1 in
 # that case.
+#
+# If $success_or_failure is set, then the last two strings are the
+# strings the response will be scanned for. The first of the two
+# indicates success, the latter indicates failure. Failure is handled
+# like a timeout.
 function _send_qemu_cmd()
 {
     local h=${1}
@@ -XXX,XX +XXX,XX @@ function _send_qemu_cmd()
         use_error="no"
     fi
     # This array element extraction is done to accommodate pathnames with spaces
-    cmd=${@: 1:${#@}-1}
-    shift $(($# - 1))
+    if [ -z "${success_or_failure}" ]; then
+        cmd=${@: 1:${#@}-1}
+        shift $(($# - 1))
+    else
+        cmd=${@: 1:${#@}-2}
+        shift $(($# - 2))
+    fi
 
     while [ ${count} -gt 0 ]
     do
         echo "${cmd}" >&${QEMU_IN[${h}]}
         if [ -n "${1}" ]; then
-            qemu_error_no_exit=${use_error} _timed_wait_for ${h} "${1}"
+            if [ -z "${success_or_failure}" ]; then
+                qemu_error_no_exit=${use_error} _timed_wait_for ${h} "${1}"
+            else
+                qemu_error_no_exit=${use_error} _timed_wait_for ${h} "${1}" "${2}"
+            fi
             if [ ${QEMU_STATUS[$h]} -eq 0 ]; then
                 return
             fi
--
2.13.6

Block jobs are already paused using the BdrvChildRole drain callbacks,
so we don't need an additional block_job_pause_all() call.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/io.c              |  4 ----
 tests/test-bdrv-drain.c | 10 ++++------
 2 files changed, 4 insertions(+), 10 deletions(-)

diff --git a/block/io.c b/block/io.c
index XXXXXXX..XXXXXXX 100644
--- a/block/io.c
+++ b/block/io.c
@@ -XXX,XX +XXX,XX @@ void bdrv_drain_all_begin(void)
      * context. */
     assert(qemu_get_current_aio_context() == qemu_get_aio_context());
 
-    block_job_pause_all();
-
     for (bs = bdrv_first(&it); bs; bs = bdrv_next(&it)) {
         AioContext *aio_context = bdrv_get_aio_context(bs);
 
@@ -XXX,XX +XXX,XX @@ void bdrv_drain_all_end(void)
         aio_enable_external(aio_context);
         aio_context_release(aio_context);
     }
-
-    block_job_resume_all();
 }
 
 void bdrv_drain_all(void)
diff --git a/tests/test-bdrv-drain.c b/tests/test-bdrv-drain.c
index XXXXXXX..XXXXXXX 100644
--- a/tests/test-bdrv-drain.c
+++ b/tests/test-bdrv-drain.c
@@ -XXX,XX +XXX,XX @@ static void test_blockjob_common(enum drain_type drain_type)
     do_drain_begin(drain_type, src);
 
     if (drain_type == BDRV_DRAIN_ALL) {
-        /* bdrv_drain_all() drains both src and target, and involves an
-         * additional block_job_pause_all() */
-        g_assert_cmpint(job->pause_count, ==, 3);
+        /* bdrv_drain_all() drains both src and target */
+        g_assert_cmpint(job->pause_count, ==, 2);
     } else {
         g_assert_cmpint(job->pause_count, ==, 1);
     }
@@ -XXX,XX +XXX,XX @@ static void test_blockjob_common(enum drain_type drain_type)
     do_drain_begin(drain_type, target);
 
     if (drain_type == BDRV_DRAIN_ALL) {
-        /* bdrv_drain_all() drains both src and target, and involves an
-         * additional block_job_pause_all() */
-        g_assert_cmpint(job->pause_count, ==, 3);
+        /* bdrv_drain_all() drains both src and target */
+        g_assert_cmpint(job->pause_count, ==, 2);
     } else {
         g_assert_cmpint(job->pause_count, ==, 1);
     }
--
2.13.6
Every block job has a RateLimit, and they all do the exact same thing
with it, so it should be common infrastructure. Move the struct field
for a start.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
---
 include/block/blockjob.h | 4 ++++
 block/backup.c           | 5 ++---
 block/commit.c           | 5 ++---
 block/mirror.c           | 6 +++---
 block/stream.c           | 5 ++---
 5 files changed, 13 insertions(+), 12 deletions(-)

diff --git a/include/block/blockjob.h b/include/block/blockjob.h
index XXXXXXX..XXXXXXX 100644
--- a/include/block/blockjob.h
+++ b/include/block/blockjob.h
@@ -XXX,XX +XXX,XX @@
 #define BLOCKJOB_H
 
 #include "block/block.h"
+#include "qemu/ratelimit.h"
 
 typedef struct BlockJobDriver BlockJobDriver;
 typedef struct BlockJobTxn BlockJobTxn;
@@ -XXX,XX +XXX,XX @@ typedef struct BlockJob {
     /** Speed that was set with @block_job_set_speed.  */
     int64_t speed;
 
+    /** Rate limiting data structure for implementing @speed. */
+    RateLimit limit;
+
     /** The completion function that will be called when the job completes.  */
     BlockCompletionFunc *cb;
 
diff --git a/block/backup.c b/block/backup.c
index XXXXXXX..XXXXXXX 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -XXX,XX +XXX,XX @@ typedef struct BackupBlockJob {
     /* bitmap for sync=incremental */
     BdrvDirtyBitmap *sync_bitmap;
     MirrorSyncMode sync_mode;
-    RateLimit limit;
     BlockdevOnError on_source_error;
     BlockdevOnError on_target_error;
     CoRwlock flush_rwlock;
@@ -XXX,XX +XXX,XX @@ static void backup_set_speed(BlockJob *job, int64_t speed, Error **errp)
         error_setg(errp, QERR_INVALID_PARAMETER, "speed");
         return;
     }
-    ratelimit_set_speed(&s->limit, speed, SLICE_TIME);
+    ratelimit_set_speed(&s->common.limit, speed, SLICE_TIME);
 }
 
 static void backup_cleanup_sync_bitmap(BackupBlockJob *job, int ret)
@@ -XXX,XX +XXX,XX @@ static bool coroutine_fn yield_and_check(BackupBlockJob *job)
      * (without, VM does not reboot)
      */
     if (job->common.speed) {
-        uint64_t delay_ns = ratelimit_calculate_delay(&job->limit,
+        uint64_t delay_ns = ratelimit_calculate_delay(&job->common.limit,
                                                       job->bytes_read);
         job->bytes_read = 0;
         block_job_sleep_ns(&job->common, delay_ns);
diff --git a/block/commit.c b/block/commit.c
index XXXXXXX..XXXXXXX 100644
--- a/block/commit.c
+++ b/block/commit.c
@@ -XXX,XX +XXX,XX @@ enum {
 
 typedef struct CommitBlockJob {
     BlockJob common;
-    RateLimit limit;
     BlockDriverState *commit_top_bs;
     BlockBackend *top;
     BlockBackend *base;
@@ -XXX,XX +XXX,XX @@ static void coroutine_fn commit_run(void *opaque)
         block_job_progress_update(&s->common, n);
 
         if (copy && s->common.speed) {
-            delay_ns = ratelimit_calculate_delay(&s->limit, n);
+            delay_ns = ratelimit_calculate_delay(&s->common.limit, n);
         } else {
             delay_ns = 0;
         }
@@ -XXX,XX +XXX,XX @@ static void commit_set_speed(BlockJob *job, int64_t speed, Error **errp)
         error_setg(errp, QERR_INVALID_PARAMETER, "speed");
         return;
     }
-    ratelimit_set_speed(&s->limit, speed, SLICE_TIME);
+    ratelimit_set_speed(&s->common.limit, speed, SLICE_TIME);
 }
 
 static const BlockJobDriver commit_job_driver = {
diff --git a/block/mirror.c b/block/mirror.c
index XXXXXXX..XXXXXXX 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -XXX,XX +XXX,XX @@ typedef struct MirrorBuffer {
 
 typedef struct MirrorBlockJob {
     BlockJob common;
-    RateLimit limit;
     BlockBackend *target;
     BlockDriverState *mirror_top_bs;
     BlockDriverState *source;
@@ -XXX,XX +XXX,XX @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
         offset += io_bytes;
         nb_chunks -= DIV_ROUND_UP(io_bytes, s->granularity);
         if (s->common.speed) {
-            delay_ns = ratelimit_calculate_delay(&s->limit, io_bytes_acct);
+            delay_ns = ratelimit_calculate_delay(&s->common.limit,
+                                                 io_bytes_acct);
         }
     }
     return delay_ns;
@@ -XXX,XX +XXX,XX @@ static void mirror_set_speed(BlockJob *job, int64_t speed, Error **errp)
         error_setg(errp, QERR_INVALID_PARAMETER, "speed");
         return;
     }
-    ratelimit_set_speed(&s->limit, speed, SLICE_TIME);
+    ratelimit_set_speed(&s->common.limit, speed, SLICE_TIME);
 }
 
 static void mirror_complete(BlockJob *job, Error **errp)
diff --git a/block/stream.c b/block/stream.c
index XXXXXXX..XXXXXXX 100644
--- a/block/stream.c
+++ b/block/stream.c
@@ -XXX,XX +XXX,XX @@ enum {
 
 typedef struct StreamBlockJob {
     BlockJob common;
-    RateLimit limit;
     BlockDriverState *base;
     BlockdevOnError on_error;
     char *backing_file_str;
@@ -XXX,XX +XXX,XX @@ static void coroutine_fn stream_run(void *opaque)
         /* Publish progress */
         block_job_progress_update(&s->common, n);
         if (copy && s->common.speed) {
-            delay_ns = ratelimit_calculate_delay(&s->limit, n);
+            delay_ns = ratelimit_calculate_delay(&s->common.limit, n);
         } else {
             delay_ns = 0;
         }
@@ -XXX,XX +XXX,XX @@ static void stream_set_speed(BlockJob *job, int64_t speed, Error **errp)
         error_setg(errp, QERR_INVALID_PARAMETER, "speed");
         return;
     }
-    ratelimit_set_speed(&s->limit, speed, SLICE_TIME);
+    ratelimit_set_speed(&s->common.limit, speed, SLICE_TIME);
 }
 
 static const BlockJobDriver stream_job_driver = {
--
2.13.6

bdrv_do_drained_begin() restricts the call of parent callbacks and
aio_disable_external() to the outermost drain section, but the block
driver callbacks are always called. bdrv_do_drained_end() must match
this behaviour, otherwise nodes stay drained even if begin/end calls
were balanced.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/io.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/block/io.c b/block/io.c
index XXXXXXX..XXXXXXX 100644
--- a/block/io.c
+++ b/block/io.c
@@ -XXX,XX +XXX,XX @@ void bdrv_drained_begin(BlockDriverState *bs)
 
 void bdrv_drained_end(BlockDriverState *bs)
 {
+    int old_quiesce_counter;
+
     if (qemu_in_coroutine()) {
         bdrv_co_yield_to_drain(bs, false);
         return;
     }
     assert(bs->quiesce_counter > 0);
-    if (atomic_fetch_dec(&bs->quiesce_counter) > 1) {
-        return;
-    }
+    old_quiesce_counter = atomic_fetch_dec(&bs->quiesce_counter);
 
     /* Re-enable things in child-to-parent order */
     bdrv_drain_invoke(bs, false, false);
-    bdrv_parent_drained_end(bs);
-    aio_enable_external(bdrv_get_aio_context(bs));
+    if (old_quiesce_counter == 1) {
+        bdrv_parent_drained_end(bs);
+        aio_enable_external(bdrv_get_aio_context(bs));
+    }
 }
 
 /*
--
2.13.6
From: Max Reitz <mreitz@redhat.com>

userfaultfd support depends on the host kernel, so it may not be
available. If so, 181 and 201 should be skipped.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20180406151731.4285-3-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 tests/qemu-iotests/181 | 13 +++++++++++++
 tests/qemu-iotests/201 | 13 +++++++++++++
 2 files changed, 26 insertions(+)

diff --git a/tests/qemu-iotests/181 b/tests/qemu-iotests/181
index XXXXXXX..XXXXXXX 100755
--- a/tests/qemu-iotests/181
+++ b/tests/qemu-iotests/181
@@ -XXX,XX +XXX,XX @@ echo
 # Enable postcopy-ram capability both on source and destination
 silent=yes
 _send_qemu_cmd $dest 'migrate_set_capability postcopy-ram on' "(qemu)"
+
+qemu_error_no_exit=yes success_or_failure=yes \
+    _send_qemu_cmd $dest '' "(qemu)" "Postcopy is not supported"
+if [ ${QEMU_STATUS[$dest]} -lt 0 ]; then
+    _send_qemu_cmd $dest '' "(qemu)"
+
+    _send_qemu_cmd $src 'quit' ""
+    _send_qemu_cmd $dest 'quit' ""
+    wait=1 _cleanup_qemu
+
+    _notrun 'Postcopy is not supported'
+fi
+
 _send_qemu_cmd $src 'migrate_set_speed 4k' "(qemu)"
 _send_qemu_cmd $src 'migrate_set_capability postcopy-ram on' "(qemu)"
 _send_qemu_cmd $src "migrate -d unix:${MIG_SOCKET}" "(qemu)"
diff --git a/tests/qemu-iotests/201 b/tests/qemu-iotests/201
index XXXXXXX..XXXXXXX 100755
--- a/tests/qemu-iotests/201
+++ b/tests/qemu-iotests/201
@@ -XXX,XX +XXX,XX @@ echo
 
 silent=yes
 _send_qemu_cmd $dest 'migrate_set_capability postcopy-ram on' "(qemu)"
+
+qemu_error_no_exit=yes success_or_failure=yes \
+    _send_qemu_cmd $dest '' "(qemu)" "Postcopy is not supported"
+if [ ${QEMU_STATUS[$dest]} -lt 0 ]; then
+    _send_qemu_cmd $dest '' "(qemu)"
+
+    _send_qemu_cmd $src 'quit' ""
+    _send_qemu_cmd $dest 'quit' ""
+    wait=1 _cleanup_qemu
+
+    _notrun 'Postcopy is not supported'
+fi
+
 _send_qemu_cmd $src 'migrate_set_capability postcopy-ram on' "(qemu)"
 _send_qemu_cmd $src "migrate -d unix:${MIG_SOCKET}" "(qemu)"
--
2.13.6

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 tests/test-bdrv-drain.c | 57 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 57 insertions(+)

diff --git a/tests/test-bdrv-drain.c b/tests/test-bdrv-drain.c
index XXXXXXX..XXXXXXX 100644
--- a/tests/test-bdrv-drain.c
+++ b/tests/test-bdrv-drain.c
@@ -XXX,XX +XXX,XX @@ static void aio_ret_cb(void *opaque, int ret)
 enum drain_type {
     BDRV_DRAIN_ALL,
     BDRV_DRAIN,
+    DRAIN_TYPE_MAX,
 };
 
 static void do_drain_begin(enum drain_type drain_type, BlockDriverState *bs)
@@ -XXX,XX +XXX,XX @@ static void test_quiesce_drain(void)
     test_quiesce_common(BDRV_DRAIN, false);
 }
 
+static void test_nested(void)
+{
+    BlockBackend *blk;
+    BlockDriverState *bs, *backing;
+    BDRVTestState *s, *backing_s;
+    enum drain_type outer, inner;
+
+    blk = blk_new(BLK_PERM_ALL, BLK_PERM_ALL);
+    bs = bdrv_new_open_driver(&bdrv_test, "test-node", BDRV_O_RDWR,
+                              &error_abort);
+    s = bs->opaque;
+    blk_insert_bs(blk, bs, &error_abort);
+
+    backing = bdrv_new_open_driver(&bdrv_test, "backing", 0, &error_abort);
+    backing_s = backing->opaque;
+    bdrv_set_backing_hd(bs, backing, &error_abort);
+
+    for (outer = 0; outer < DRAIN_TYPE_MAX; outer++) {
+        for (inner = 0; inner < DRAIN_TYPE_MAX; inner++) {
+            /* XXX bdrv_drain_all() doesn't increase the quiesce_counter */
+            int bs_quiesce      = (outer != BDRV_DRAIN_ALL) +
+                                  (inner != BDRV_DRAIN_ALL);
+            int backing_quiesce = 0;
+            int backing_cb_cnt  = (outer != BDRV_DRAIN) +
+                                  (inner != BDRV_DRAIN);
+
+            g_assert_cmpint(bs->quiesce_counter, ==, 0);
+            g_assert_cmpint(backing->quiesce_counter, ==, 0);
+            g_assert_cmpint(s->drain_count, ==, 0);
+            g_assert_cmpint(backing_s->drain_count, ==, 0);
+
+            do_drain_begin(outer, bs);
+            do_drain_begin(inner, bs);
+
+            g_assert_cmpint(bs->quiesce_counter, ==, bs_quiesce);
+            g_assert_cmpint(backing->quiesce_counter, ==, backing_quiesce);
+            g_assert_cmpint(s->drain_count, ==, 2);
+            g_assert_cmpint(backing_s->drain_count, ==, backing_cb_cnt);
+
+            do_drain_end(inner, bs);
+            do_drain_end(outer, bs);
+
+            g_assert_cmpint(bs->quiesce_counter, ==, 0);
+            g_assert_cmpint(backing->quiesce_counter, ==, 0);
+            g_assert_cmpint(s->drain_count, ==, 0);
+            g_assert_cmpint(backing_s->drain_count, ==, 0);
+        }
+    }
+
+    bdrv_unref(backing);
+    bdrv_unref(bs);
+    blk_unref(blk);
+}
+
 
 typedef struct TestBlockJob {
     BlockJob common;
@@ -XXX,XX +XXX,XX @@ int main(int argc, char **argv)
     g_test_add_func("/bdrv-drain/quiesce/drain_all", test_quiesce_drain_all);
     g_test_add_func("/bdrv-drain/quiesce/drain", test_quiesce_drain);
 
+    g_test_add_func("/bdrv-drain/nested", test_nested);
+
     g_test_add_func("/bdrv-drain/blockjob/drain_all", test_blockjob_drain_all);
     g_test_add_func("/bdrv-drain/blockjob/drain", test_blockjob_drain);
 
--
2.13.6
From: Max Reitz <mreitz@redhat.com>

Currently we never actually check whether the WRITE_UNCHANGED
permission has been taken for unchanging writes. But the one check that
is commented out checks both WRITE and WRITE_UNCHANGED; and considering
that WRITE_UNCHANGED is already documented as being weaker than WRITE,
we should probably explicitly document WRITE to include WRITE_UNCHANGED.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20180421132929.21610-3-mreitz@redhat.com
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 include/block/block.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/include/block/block.h b/include/block/block.h
index XXXXXXX..XXXXXXX 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -XXX,XX +XXX,XX @@ enum {
      * This permission (which is weaker than BLK_PERM_WRITE) is both enough and
      * required for writes to the block node when the caller promises that
      * the visible disk content doesn't change.
+     *
+     * As the BLK_PERM_WRITE permission is strictly stronger, either is
+     * sufficient to perform an unchanging write.
      */
     BLK_PERM_WRITE_UNCHANGED    = 0x04,
 
--
2.13.6

This is in preparation for subtree drains, i.e. drained sections that
affect not only a single node, but recursively all child nodes, too.

Calling the parent callbacks for drain is pointless when we just came
from that parent node recursively and leads to multiple increases of
bs->quiesce_counter in a single drain call. Don't do it.

In order for this to work correctly, the parent callback must be called
for every bdrv_drain_begin/end() call, not only for the outermost one:

If we have a node N with two parents A and B, recursive draining of A
should cause the quiesce_counter of B to increase because its child N is
drained independently of B. If now B is recursively drained, too, A must
increase its quiesce_counter because N is drained independently of A
only now, even if N is going from quiesce_counter 1 to 2.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 include/block/block.h |  4 ++--
 block.c               | 13 +++++++++----
 block/io.c            | 47 ++++++++++++++++++++++++++++++-------------
 3 files changed, 45 insertions(+), 19 deletions(-)

diff --git a/include/block/block.h b/include/block/block.h
index XXXXXXX..XXXXXXX 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -XXX,XX +XXX,XX @@ void bdrv_io_unplug(BlockDriverState *bs);
  * Begin a quiesced section of all users of @bs. This is part of
  * bdrv_drained_begin.
  */
-void bdrv_parent_drained_begin(BlockDriverState *bs);
+void bdrv_parent_drained_begin(BlockDriverState *bs, BdrvChild *ignore);
 
 /**
  * bdrv_parent_drained_end:
@@ -XXX,XX +XXX,XX @@ void bdrv_parent_drained_begin(BlockDriverState *bs);
  * End a quiesced section of all users of @bs. This is part of
  * bdrv_drained_end.
  */
-void bdrv_parent_drained_end(BlockDriverState *bs);
+void bdrv_parent_drained_end(BlockDriverState *bs, BdrvChild *ignore);
 
 /**
  * bdrv_drained_begin:
diff --git a/block.c b/block.c
index XXXXXXX..XXXXXXX 100644
--- a/block.c
+++ b/block.c
@@ -XXX,XX +XXX,XX @@ static void bdrv_replace_child_noperm(BdrvChild *child,
                                       BlockDriverState *new_bs)
 {
     BlockDriverState *old_bs = child->bs;
+    int i;
 
     if (old_bs && new_bs) {
         assert(bdrv_get_aio_context(old_bs) == bdrv_get_aio_context(new_bs));
     }
     if (old_bs) {
         if (old_bs->quiesce_counter && child->role->drained_end) {
-            child->role->drained_end(child);
+            for (i = 0; i < old_bs->quiesce_counter; i++) {
+                child->role->drained_end(child);
+            }
         }
         if (child->role->detach) {
             child->role->detach(child);
@@ -XXX,XX +XXX,XX @@ static void bdrv_replace_child_noperm(BdrvChild *child,
     if (new_bs) {
         QLIST_INSERT_HEAD(&new_bs->parents, child, next_parent);
         if (new_bs->quiesce_counter && child->role->drained_begin) {
-            child->role->drained_begin(child);
+            for (i = 0; i < new_bs->quiesce_counter; i++) {
+                child->role->drained_begin(child);
+            }
         }
 
         if (child->role->attach) {
@@ -XXX,XX +XXX,XX @@ void bdrv_set_aio_context(BlockDriverState *bs, AioContext *new_context)
     AioContext *ctx = bdrv_get_aio_context(bs);
 
     aio_disable_external(ctx);
-    bdrv_parent_drained_begin(bs);
+    bdrv_parent_drained_begin(bs, NULL);
     bdrv_drain(bs); /* ensure there are no in-flight requests */
 
     while (aio_poll(ctx, false)) {
@@ -XXX,XX +XXX,XX @@ void bdrv_set_aio_context(BlockDriverState *bs, AioContext *new_context)
      */
     aio_context_acquire(new_context);
     bdrv_attach_aio_context(bs, new_context);
-    bdrv_parent_drained_end(bs);
+    bdrv_parent_drained_end(bs, NULL);
     aio_enable_external(ctx);
     aio_context_release(new_context);
 }
diff --git a/block/io.c b/block/io.c
index XXXXXXX..XXXXXXX 100644
--- a/block/io.c
+++ b/block/io.c
@@ -XXX,XX +XXX,XX @@
 static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs,
     int64_t offset, int bytes, BdrvRequestFlags flags);
 
-void bdrv_parent_drained_begin(BlockDriverState *bs)
+void bdrv_parent_drained_begin(BlockDriverState *bs, BdrvChild *ignore)
 {
     BdrvChild *c, *next;
 
     QLIST_FOREACH_SAFE(c, &bs->parents, next_parent, next) {
+        if (c == ignore) {
+            continue;
+        }
         if (c->role->drained_begin) {
             c->role->drained_begin(c);
         }
     }
 }
 
-void bdrv_parent_drained_end(BlockDriverState *bs)
+void bdrv_parent_drained_end(BlockDriverState *bs, BdrvChild *ignore)
 {
     BdrvChild *c, *next;
 
     QLIST_FOREACH_SAFE(c, &bs->parents, next_parent, next) {
+        if (c == ignore) {
+            continue;
+        }
         if (c->role->drained_end) {
             c->role->drained_end(c);
         }
@@ -XXX,XX +XXX,XX @@ typedef struct {
     BlockDriverState *bs;
     bool done;
     bool begin;
+    BdrvChild *parent;
 } BdrvCoDrainData;
 
 static void coroutine_fn bdrv_drain_invoke_entry(void *opaque)
@@ -XXX,XX +XXX,XX @@ static bool bdrv_drain_recurse(BlockDriverState *bs)
     return waited;
 }
 
+static void bdrv_do_drained_begin(BlockDriverState *bs, BdrvChild *parent);
+static void bdrv_do_drained_end(BlockDriverState *bs, BdrvChild *parent);
+
 static void bdrv_co_drain_bh_cb(void *opaque)
 {
     BdrvCoDrainData *data = opaque;
@@ -XXX,XX +XXX,XX @@ static void bdrv_co_drain_bh_cb(void *opaque)
 
     bdrv_dec_in_flight(bs);
     if (data->begin) {
-        bdrv_drained_begin(bs);
+        bdrv_do_drained_begin(bs, data->parent);
     } else {
-        bdrv_drained_end(bs);
+        bdrv_do_drained_end(bs, data->parent);
     }
 
     data->done = true;
@@ -XXX,XX +XXX,XX @@ static void bdrv_co_drain_bh_cb(void *opaque)
 }
 
 static void coroutine_fn bdrv_co_yield_to_drain(BlockDriverState *bs,
-                                                bool begin)
+                                                bool begin, BdrvChild *parent)
 {
     BdrvCoDrainData data;
 
@@ -XXX,XX +XXX,XX @@ static void coroutine_fn bdrv_co_yield_to_drain(BlockDriverState *bs,
         .bs = bs,
         .done = false,
         .begin = begin,
+        .parent = parent,
     };
     bdrv_inc_in_flight(bs);
     aio_bh_schedule_oneshot(bdrv_get_aio_context(bs),
@@ -XXX,XX +XXX,XX @@ static void coroutine_fn bdrv_co_yield_to_drain(BlockDriverState *bs,
     assert(data.done);
 }
 
-void bdrv_drained_begin(BlockDriverState *bs)
+static void bdrv_do_drained_begin(BlockDriverState *bs, BdrvChild *parent)
 {
     if (qemu_in_coroutine()) {
-        bdrv_co_yield_to_drain(bs, true);
+        bdrv_co_yield_to_drain(bs, true, parent);
         return;
     }
 
     /* Stop things in parent-to-child order */
     if (atomic_fetch_inc(&bs->quiesce_counter) == 0) {
         aio_disable_external(bdrv_get_aio_context(bs));
-        bdrv_parent_drained_begin(bs);
     }
 
+    bdrv_parent_drained_begin(bs, parent);
     bdrv_drain_invoke(bs, true, false);
     bdrv_drain_recurse(bs);
 }
 
-void bdrv_drained_end(BlockDriverState *bs)
+void bdrv_drained_begin(BlockDriverState *bs)
+{
+    bdrv_do_drained_begin(bs, NULL);
+}
+
+static void bdrv_do_drained_end(BlockDriverState *bs, BdrvChild *parent)
 {
     int old_quiesce_counter;
 
     if (qemu_in_coroutine()) {
-        bdrv_co_yield_to_drain(bs, false);
+        bdrv_co_yield_to_drain(bs, false, parent);
         return;
     }
     assert(bs->quiesce_counter > 0);
@@ -XXX,XX +XXX,XX @@ void bdrv_drained_end(BlockDriverState *bs)
 
     /* Re-enable things in child-to-parent order */
     bdrv_drain_invoke(bs, false, false);
+    bdrv_parent_drained_end(bs, parent);
     if (old_quiesce_counter == 1) {
-        bdrv_parent_drained_end(bs);
         aio_enable_external(bdrv_get_aio_context(bs));
     }
 }
 
+void bdrv_drained_end(BlockDriverState *bs)
+{
+    bdrv_do_drained_end(bs, NULL);
+}
+
 /*
  * Wait for pending requests to complete on a single BlockDriverState subtree,
  * and suspend block driver's internal I/O until next request arrives.
@@ -XXX,XX +XXX,XX @@ void bdrv_drain_all_begin(void)
         /* Stop things in parent-to-child order */
         aio_context_acquire(aio_context);
         aio_disable_external(aio_context);
-        bdrv_parent_drained_begin(bs);
+        bdrv_parent_drained_begin(bs, NULL);
         bdrv_drain_invoke(bs, true, true);
         aio_context_release(aio_context);
 
@@ -XXX,XX +XXX,XX @@ void bdrv_drain_all_end(void)
         /* Re-enable things in child-to-parent order */
         aio_context_acquire(aio_context);
         bdrv_drain_invoke(bs, false, true);
-        bdrv_parent_drained_end(bs);
+        bdrv_parent_drained_end(bs, NULL);
         aio_enable_external(aio_context);
         aio_context_release(aio_context);
     }
--
2.13.6
Block job drivers are not expected to mess with the internals of the
BlockJob object, so provide wrapper functions for one of the cases where
they still do it: Updating the progress counter.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
---
 include/block/blockjob.h | 19 +++++++++++++++++++
 block/backup.c           | 22 +++++++++++++---------
 block/commit.c           | 16 ++++++++--------
 block/mirror.c           | 11 +++++------
 block/stream.c           | 14 ++++++++------
 blockjob.c               | 10 ++++++++++
 6 files changed, 63 insertions(+), 29 deletions(-)

diff --git a/include/block/blockjob.h b/include/block/blockjob.h
index XXXXXXX..XXXXXXX 100644
--- a/include/block/blockjob.h
+++ b/include/block/blockjob.h
@@ -XXX,XX +XXX,XX @@ void block_job_finalize(BlockJob *job, Error **errp);
 void block_job_dismiss(BlockJob **job, Error **errp);
 
 /**
+ * block_job_progress_update:
+ * @job: The job that has made progress
+ * @done: How much progress the job made
+ *
+ * Updates the progress counter of the job.
+ */
+void block_job_progress_update(BlockJob *job, uint64_t done);

bdrv_drained_begin() waits for the completion of requests in the whole
subtree, but it only actually keeps its immediate bs parameter quiesced
until bdrv_drained_end().

Add a version that keeps the whole subtree drained. As of this commit,
graph changes cannot be allowed during a subtree drained section, but
this will be fixed soon.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 include/block/block.h | 13 +++++++++++++
 block/io.c            | 54 ++++++++++++++++++++++++++++++++++-----------
 2 files changed, 56 insertions(+), 11 deletions(-)

diff --git a/include/block/block.h b/include/block/block.h
index XXXXXXX..XXXXXXX 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -XXX,XX +XXX,XX @@ void bdrv_parent_drained_end(BlockDriverState *bs, BdrvChild *ignore);
 void bdrv_drained_begin(BlockDriverState *bs);
 
 /**
+ * Like bdrv_drained_begin, but recursively begins a quiesced section for
+ * exclusive access to all child nodes as well.
+ *
+ * Graph changes are not allowed during a subtree drain section.
+ */
+void bdrv_subtree_drained_begin(BlockDriverState *bs);
33
+
29
+
34
+/**
30
+/**
35
+ * block_job_progress_set_remaining:
31
* bdrv_drained_end:
36
+ * @job: The job whose expected progress end value is set
32
*
37
+ * @remaining: Expected end value of the progress counter of the job
33
* End a quiescent section started by bdrv_drained_begin().
38
+ *
34
*/
39
+ * Sets the expected end value of the progress counter of a job so that a
35
void bdrv_drained_end(BlockDriverState *bs);
40
+ * completion percentage can be calculated when the progress is updated.
36
37
+/**
38
+ * End a quiescent section started by bdrv_subtree_drained_begin().
41
+ */
39
+ */
42
+void block_job_progress_set_remaining(BlockJob *job, uint64_t remaining);
40
+void bdrv_subtree_drained_end(BlockDriverState *bs);
43
+
41
+
44
+/**
42
void bdrv_add_child(BlockDriverState *parent, BlockDriverState *child,
45
* block_job_query:
43
Error **errp);
46
* @job: The job to get information about.
44
void bdrv_del_child(BlockDriverState *parent, BdrvChild *child, Error **errp);
47
*
45
diff --git a/block/io.c b/block/io.c
48
diff --git a/block/backup.c b/block/backup.c
49
index XXXXXXX..XXXXXXX 100644
46
index XXXXXXX..XXXXXXX 100644
50
--- a/block/backup.c
47
--- a/block/io.c
51
+++ b/block/backup.c
48
+++ b/block/io.c
52
@@ -XXX,XX +XXX,XX @@ typedef struct BackupBlockJob {
49
@@ -XXX,XX +XXX,XX @@ typedef struct {
53
BlockdevOnError on_source_error;
50
BlockDriverState *bs;
54
BlockdevOnError on_target_error;
51
bool done;
55
CoRwlock flush_rwlock;
52
bool begin;
56
+ uint64_t len;
53
+ bool recursive;
57
uint64_t bytes_read;
54
BdrvChild *parent;
58
int64_t cluster_size;
55
} BdrvCoDrainData;
59
bool compress;
56
60
@@ -XXX,XX +XXX,XX @@ static int coroutine_fn backup_do_cow(BackupBlockJob *job,
57
@@ -XXX,XX +XXX,XX @@ static bool bdrv_drain_recurse(BlockDriverState *bs)
61
58
return waited;
62
trace_backup_do_cow_process(job, start);
59
}
63
60
64
- n = MIN(job->cluster_size, job->common.len - start);
61
-static void bdrv_do_drained_begin(BlockDriverState *bs, BdrvChild *parent);
65
+ n = MIN(job->cluster_size, job->len - start);
62
-static void bdrv_do_drained_end(BlockDriverState *bs, BdrvChild *parent);
66
63
+static void bdrv_do_drained_begin(BlockDriverState *bs, bool recursive,
67
if (!bounce_buffer) {
64
+ BdrvChild *parent);
68
bounce_buffer = blk_blockalign(blk, job->cluster_size);
65
+static void bdrv_do_drained_end(BlockDriverState *bs, bool recursive,
69
@@ -XXX,XX +XXX,XX @@ static int coroutine_fn backup_do_cow(BackupBlockJob *job,
66
+ BdrvChild *parent);
70
* offset field is an opaque progress value, it is not a disk offset.
67
71
*/
68
static void bdrv_co_drain_bh_cb(void *opaque)
72
job->bytes_read += n;
69
{
73
- job->common.offset += n;
70
@@ -XXX,XX +XXX,XX @@ static void bdrv_co_drain_bh_cb(void *opaque)
74
+ block_job_progress_update(&job->common, n);
71
72
bdrv_dec_in_flight(bs);
73
if (data->begin) {
74
- bdrv_do_drained_begin(bs, data->parent);
75
+ bdrv_do_drained_begin(bs, data->recursive, data->parent);
76
} else {
77
- bdrv_do_drained_end(bs, data->parent);
78
+ bdrv_do_drained_end(bs, data->recursive, data->parent);
75
}
79
}
76
80
77
out:
81
data->done = true;
78
@@ -XXX,XX +XXX,XX @@ void backup_do_checkpoint(BlockJob *job, Error **errp)
82
@@ -XXX,XX +XXX,XX @@ static void bdrv_co_drain_bh_cb(void *opaque)
83
}
84
85
static void coroutine_fn bdrv_co_yield_to_drain(BlockDriverState *bs,
86
- bool begin, BdrvChild *parent)
87
+ bool begin, bool recursive,
88
+ BdrvChild *parent)
89
{
90
BdrvCoDrainData data;
91
92
@@ -XXX,XX +XXX,XX @@ static void coroutine_fn bdrv_co_yield_to_drain(BlockDriverState *bs,
93
.bs = bs,
94
.done = false,
95
.begin = begin,
96
+ .recursive = recursive,
97
.parent = parent,
98
};
99
bdrv_inc_in_flight(bs);
100
@@ -XXX,XX +XXX,XX @@ static void coroutine_fn bdrv_co_yield_to_drain(BlockDriverState *bs,
101
assert(data.done);
102
}
103
104
-static void bdrv_do_drained_begin(BlockDriverState *bs, BdrvChild *parent)
105
+static void bdrv_do_drained_begin(BlockDriverState *bs, bool recursive,
106
+ BdrvChild *parent)
107
{
108
+ BdrvChild *child, *next;
109
+
110
if (qemu_in_coroutine()) {
111
- bdrv_co_yield_to_drain(bs, true, parent);
112
+ bdrv_co_yield_to_drain(bs, true, recursive, parent);
79
return;
113
return;
80
}
114
}
81
115
82
- len = DIV_ROUND_UP(backup_job->common.len, backup_job->cluster_size);
116
@@ -XXX,XX +XXX,XX @@ static void bdrv_do_drained_begin(BlockDriverState *bs, BdrvChild *parent)
83
+ len = DIV_ROUND_UP(backup_job->len, backup_job->cluster_size);
117
bdrv_parent_drained_begin(bs, parent);
84
hbitmap_set(backup_job->copy_bitmap, 0, len);
118
bdrv_drain_invoke(bs, true, false);
119
bdrv_drain_recurse(bs);
120
+
121
+ if (recursive) {
122
+ QLIST_FOREACH_SAFE(child, &bs->children, next, next) {
123
+ bdrv_do_drained_begin(child->bs, true, child);
124
+ }
125
+ }
85
}
126
}
86
127
87
@@ -XXX,XX +XXX,XX @@ static void backup_incremental_init_copy_bitmap(BackupBlockJob *job)
128
void bdrv_drained_begin(BlockDriverState *bs)
88
bdrv_set_dirty_iter(dbi, next_cluster * job->cluster_size);
129
{
89
}
130
- bdrv_do_drained_begin(bs, NULL);
90
131
+ bdrv_do_drained_begin(bs, false, NULL);
91
- job->common.offset = job->common.len -
92
- hbitmap_count(job->copy_bitmap) * job->cluster_size;
93
+ /* TODO block_job_progress_set_remaining() would make more sense */
94
+ block_job_progress_update(&job->common,
95
+ job->len - hbitmap_count(job->copy_bitmap) * job->cluster_size);
96
97
bdrv_dirty_iter_free(dbi);
98
}
99
@@ -XXX,XX +XXX,XX @@ static void coroutine_fn backup_run(void *opaque)
100
QLIST_INIT(&job->inflight_reqs);
101
qemu_co_rwlock_init(&job->flush_rwlock);
102
103
- nb_clusters = DIV_ROUND_UP(job->common.len, job->cluster_size);
104
+ nb_clusters = DIV_ROUND_UP(job->len, job->cluster_size);
105
+ block_job_progress_set_remaining(&job->common, job->len);
106
+
107
job->copy_bitmap = hbitmap_alloc(nb_clusters, 0);
108
if (job->sync_mode == MIRROR_SYNC_MODE_INCREMENTAL) {
109
backup_incremental_init_copy_bitmap(job);
110
@@ -XXX,XX +XXX,XX @@ static void coroutine_fn backup_run(void *opaque)
111
ret = backup_run_incremental(job);
112
} else {
113
/* Both FULL and TOP SYNC_MODE's require copying.. */
114
- for (offset = 0; offset < job->common.len;
115
+ for (offset = 0; offset < job->len;
116
offset += job->cluster_size) {
117
bool error_is_read;
118
int alloced = 0;
119
@@ -XXX,XX +XXX,XX @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
120
goto error;
121
}
122
123
- /* job->common.len is fixed, so we can't allow resize */
124
+ /* job->len is fixed, so we can't allow resize */
125
job = block_job_create(job_id, &backup_job_driver, txn, bs,
126
BLK_PERM_CONSISTENT_READ,
127
BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE |
128
@@ -XXX,XX +XXX,XX @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
129
/* Required permissions are already taken with target's blk_new() */
130
block_job_add_bdrv(&job->common, "target", target, 0, BLK_PERM_ALL,
131
&error_abort);
132
- job->common.len = len;
133
+ job->len = len;
134
135
return &job->common;
136
137
diff --git a/block/commit.c b/block/commit.c
138
index XXXXXXX..XXXXXXX 100644
139
--- a/block/commit.c
140
+++ b/block/commit.c
141
@@ -XXX,XX +XXX,XX @@ static void coroutine_fn commit_run(void *opaque)
142
int64_t n = 0; /* bytes */
143
void *buf = NULL;
144
int bytes_written = 0;
145
- int64_t base_len;
146
+ int64_t len, base_len;
147
148
- ret = s->common.len = blk_getlength(s->top);
149
-
150
- if (s->common.len < 0) {
151
+ ret = len = blk_getlength(s->top);
152
+ if (len < 0) {
153
goto out;
154
}
155
+ block_job_progress_set_remaining(&s->common, len);
156
157
ret = base_len = blk_getlength(s->base);
158
if (base_len < 0) {
159
goto out;
160
}
161
162
- if (base_len < s->common.len) {
163
- ret = blk_truncate(s->base, s->common.len, PREALLOC_MODE_OFF, NULL);
164
+ if (base_len < len) {
165
+ ret = blk_truncate(s->base, len, PREALLOC_MODE_OFF, NULL);
166
if (ret) {
167
goto out;
168
}
169
@@ -XXX,XX +XXX,XX @@ static void coroutine_fn commit_run(void *opaque)
170
171
buf = blk_blockalign(s->top, COMMIT_BUFFER_SIZE);
172
173
- for (offset = 0; offset < s->common.len; offset += n) {
174
+ for (offset = 0; offset < len; offset += n) {
175
bool copy;
176
177
/* Note that even when no rate limit is applied we need to yield
178
@@ -XXX,XX +XXX,XX @@ static void coroutine_fn commit_run(void *opaque)
179
}
180
}
181
/* Publish progress */
182
- s->common.offset += n;
183
+ block_job_progress_update(&s->common, n);
184
185
if (copy && s->common.speed) {
186
delay_ns = ratelimit_calculate_delay(&s->limit, n);
187
diff --git a/block/mirror.c b/block/mirror.c
188
index XXXXXXX..XXXXXXX 100644
189
--- a/block/mirror.c
190
+++ b/block/mirror.c
191
@@ -XXX,XX +XXX,XX @@ static void mirror_iteration_done(MirrorOp *op, int ret)
192
bitmap_set(s->cow_bitmap, chunk_num, nb_chunks);
193
}
194
if (!s->initial_zeroing_ongoing) {
195
- s->common.offset += op->bytes;
196
+ block_job_progress_update(&s->common, op->bytes);
197
}
198
}
199
qemu_iovec_destroy(&op->qiov);
200
@@ -XXX,XX +XXX,XX @@ static void coroutine_fn mirror_run(void *opaque)
201
block_job_pause_point(&s->common);
202
203
cnt = bdrv_get_dirty_count(s->dirty_bitmap);
204
- /* s->common.offset contains the number of bytes already processed so
205
- * far, cnt is the number of dirty bytes remaining and
206
- * s->bytes_in_flight is the number of bytes currently being
207
- * processed; together those are the current total operation length */
208
- s->common.len = s->common.offset + s->bytes_in_flight + cnt;
209
+ /* cnt is the number of dirty bytes remaining and s->bytes_in_flight is
210
+ * the number of bytes currently being processed; together those are
211
+ * the current remaining operation length */
212
+ block_job_progress_set_remaining(&s->common, s->bytes_in_flight + cnt);
213
214
/* Note that even when no rate limit is applied we need to yield
215
* periodically with no pending I/O so that bdrv_drain_all() returns.
216
diff --git a/block/stream.c b/block/stream.c
217
index XXXXXXX..XXXXXXX 100644
218
--- a/block/stream.c
219
+++ b/block/stream.c
220
@@ -XXX,XX +XXX,XX @@ static void coroutine_fn stream_run(void *opaque)
221
BlockBackend *blk = s->common.blk;
222
BlockDriverState *bs = blk_bs(blk);
223
BlockDriverState *base = s->base;
224
+ int64_t len;
225
int64_t offset = 0;
226
uint64_t delay_ns = 0;
227
int error = 0;
228
@@ -XXX,XX +XXX,XX @@ static void coroutine_fn stream_run(void *opaque)
229
goto out;
230
}
231
232
- s->common.len = bdrv_getlength(bs);
233
- if (s->common.len < 0) {
234
- ret = s->common.len;
235
+ len = bdrv_getlength(bs);
236
+ if (len < 0) {
237
+ ret = len;
238
goto out;
239
}
240
+ block_job_progress_set_remaining(&s->common, len);
241
242
buf = qemu_blockalign(bs, STREAM_BUFFER_SIZE);
243
244
@@ -XXX,XX +XXX,XX @@ static void coroutine_fn stream_run(void *opaque)
245
bdrv_enable_copy_on_read(bs);
246
}
247
248
- for ( ; offset < s->common.len; offset += n) {
249
+ for ( ; offset < len; offset += n) {
250
bool copy;
251
252
/* Note that even when no rate limit is applied we need to yield
253
@@ -XXX,XX +XXX,XX @@ static void coroutine_fn stream_run(void *opaque)
254
255
/* Finish early if end of backing file has been reached */
256
if (ret == 0 && n == 0) {
257
- n = s->common.len - offset;
258
+ n = len - offset;
259
}
260
261
copy = (ret == 1);
262
@@ -XXX,XX +XXX,XX @@ static void coroutine_fn stream_run(void *opaque)
263
ret = 0;
264
265
/* Publish progress */
266
- s->common.offset += n;
267
+ block_job_progress_update(&s->common, n);
268
if (copy && s->common.speed) {
269
delay_ns = ratelimit_calculate_delay(&s->limit, n);
270
} else {
271
diff --git a/blockjob.c b/blockjob.c
272
index XXXXXXX..XXXXXXX 100644
273
--- a/blockjob.c
274
+++ b/blockjob.c
275
@@ -XXX,XX +XXX,XX @@ int block_job_complete_sync(BlockJob *job, Error **errp)
276
return block_job_finish_sync(job, &block_job_complete, errp);
277
}
278
279
+void block_job_progress_update(BlockJob *job, uint64_t done)
280
+{
281
+ job->offset += done;
282
+}
132
+}
283
+
133
+
284
+void block_job_progress_set_remaining(BlockJob *job, uint64_t remaining)
134
+void bdrv_subtree_drained_begin(BlockDriverState *bs)
285
+{
135
+{
286
+ job->len = job->offset + remaining;
136
+ bdrv_do_drained_begin(bs, true, NULL);
137
}
138
139
-static void bdrv_do_drained_end(BlockDriverState *bs, BdrvChild *parent)
140
+static void bdrv_do_drained_end(BlockDriverState *bs, bool recursive,
141
+ BdrvChild *parent)
142
{
143
+ BdrvChild *child, *next;
144
int old_quiesce_counter;
145
146
if (qemu_in_coroutine()) {
147
- bdrv_co_yield_to_drain(bs, false, parent);
148
+ bdrv_co_yield_to_drain(bs, false, recursive, parent);
149
return;
150
}
151
assert(bs->quiesce_counter > 0);
152
@@ -XXX,XX +XXX,XX @@ static void bdrv_do_drained_end(BlockDriverState *bs, BdrvChild *parent)
153
if (old_quiesce_counter == 1) {
154
aio_enable_external(bdrv_get_aio_context(bs));
155
}
156
+
157
+ if (recursive) {
158
+ QLIST_FOREACH_SAFE(child, &bs->children, next, next) {
159
+ bdrv_do_drained_end(child->bs, true, child);
160
+ }
161
+ }
162
}
163
164
void bdrv_drained_end(BlockDriverState *bs)
165
{
166
- bdrv_do_drained_end(bs, NULL);
167
+ bdrv_do_drained_end(bs, false, NULL);
287
+}
168
+}
288
+
169
+
289
BlockJobInfo *block_job_query(BlockJob *job, Error **errp)
170
+void bdrv_subtree_drained_end(BlockDriverState *bs)
290
{
171
+{
291
BlockJobInfo *info;
172
+ bdrv_do_drained_end(bs, true, NULL);
173
}
174
175
/*
292
--
176
--
293
2.13.6
177
2.13.6
294
178
295
179
From: Eric Blake <eblake@redhat.com>

We are gradually moving away from sector-based interfaces, towards
byte-based. Make the change for the last few sector-based callbacks
in the file-win32 driver.

Note that the driver was already using byte-based calls for
performing actual I/O, so this just gets rid of a round trip
of scaling; however, as I don't know if Windows is tolerant of
non-sector AIO operations, I went with the conservative approach
of modifying .bdrv_refresh_limits to override the block layer
defaults back to the pre-patch value of 512.

Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
include/block/raw-aio.h | 2 +-
block/file-win32.c | 47 +++++++++++++++++++++++++++++------------------
block/win32-aio.c | 5 ++---
3 files changed, 32 insertions(+), 22 deletions(-)

diff --git a/include/block/raw-aio.h b/include/block/raw-aio.h
index XXXXXXX..XXXXXXX 100644
--- a/include/block/raw-aio.h
+++ b/include/block/raw-aio.h
@@ -XXX,XX +XXX,XX @@ void win32_aio_cleanup(QEMUWin32AIOState *aio);
int win32_aio_attach(QEMUWin32AIOState *aio, HANDLE hfile);
BlockAIOCB *win32_aio_submit(BlockDriverState *bs,
QEMUWin32AIOState *aio, HANDLE hfile,
- int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
+ uint64_t offset, uint64_t bytes, QEMUIOVector *qiov,
BlockCompletionFunc *cb, void *opaque, int type);
void win32_aio_detach_aio_context(QEMUWin32AIOState *aio,
AioContext *old_context);
diff --git a/block/file-win32.c b/block/file-win32.c
index XXXXXXX..XXXXXXX 100644
--- a/block/file-win32.c
+++ b/block/file-win32.c
@@ -XXX,XX +XXX,XX @@ static void raw_probe_alignment(BlockDriverState *bs, Error **errp)
&dg.Geometry.BytesPerSector,
&freeClusters, &totalClusters);
bs->bl.request_alignment = dg.Geometry.BytesPerSector;
+ return;
}
+
+ /* XXX Does Windows support AIO on less than 512-byte alignment? */
+ bs->bl.request_alignment = 512;
}

static void raw_parse_flags(int flags, bool use_aio, int *access_flags,
@@ -XXX,XX +XXX,XX @@ fail:
return ret;
}

-static BlockAIOCB *raw_aio_readv(BlockDriverState *bs,
- int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
- BlockCompletionFunc *cb, void *opaque)
+static BlockAIOCB *raw_aio_preadv(BlockDriverState *bs,
+ uint64_t offset, uint64_t bytes,
+ QEMUIOVector *qiov, int flags,
+ BlockCompletionFunc *cb, void *opaque)
{
BDRVRawState *s = bs->opaque;
if (s->aio) {
- return win32_aio_submit(bs, s->aio, s->hfile, sector_num, qiov,
- nb_sectors, cb, opaque, QEMU_AIO_READ);
+ return win32_aio_submit(bs, s->aio, s->hfile, offset, bytes, qiov,
+ cb, opaque, QEMU_AIO_READ);
} else {
- return paio_submit(bs, s->hfile, sector_num << BDRV_SECTOR_BITS, qiov,
- nb_sectors << BDRV_SECTOR_BITS,
+ return paio_submit(bs, s->hfile, offset, qiov, bytes,
cb, opaque, QEMU_AIO_READ);
}
}

-static BlockAIOCB *raw_aio_writev(BlockDriverState *bs,
- int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
- BlockCompletionFunc *cb, void *opaque)
+static BlockAIOCB *raw_aio_pwritev(BlockDriverState *bs,
+ uint64_t offset, uint64_t bytes,
+ QEMUIOVector *qiov, int flags,
+ BlockCompletionFunc *cb, void *opaque)
{
BDRVRawState *s = bs->opaque;
if (s->aio) {
- return win32_aio_submit(bs, s->aio, s->hfile, sector_num, qiov,
- nb_sectors, cb, opaque, QEMU_AIO_WRITE);
+ return win32_aio_submit(bs, s->aio, s->hfile, offset, bytes, qiov,
+ cb, opaque, QEMU_AIO_WRITE);
} else {
- return paio_submit(bs, s->hfile, sector_num << BDRV_SECTOR_BITS, qiov,
- nb_sectors << BDRV_SECTOR_BITS,
+ return paio_submit(bs, s->hfile, offset, qiov, bytes,
cb, opaque, QEMU_AIO_WRITE);
}
}
@@ -XXX,XX +XXX,XX @@ BlockDriver bdrv_file = {
.bdrv_co_create_opts = raw_co_create_opts,
.bdrv_has_zero_init = bdrv_has_zero_init_1,

- .bdrv_aio_readv = raw_aio_readv,
- .bdrv_aio_writev = raw_aio_writev,
+ .bdrv_aio_preadv = raw_aio_preadv,
+ .bdrv_aio_pwritev = raw_aio_pwritev,
.bdrv_aio_flush = raw_aio_flush,

.bdrv_truncate    = raw_truncate,
@@ -XXX,XX +XXX,XX @@ static void hdev_parse_filename(const char *filename, QDict *options,
bdrv_parse_filename_strip_prefix(filename, "host_device:", options);
}

+static void hdev_refresh_limits(BlockDriverState *bs, Error **errp)
+{
+ /* XXX Does Windows support AIO on less than 512-byte alignment? */
+ bs->bl.request_alignment = 512;
+}
+
static int hdev_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp)
{
@@ -XXX,XX +XXX,XX @@ static BlockDriver bdrv_host_device = {
.bdrv_probe_device    = hdev_probe_device,
.bdrv_file_open    = hdev_open,
.bdrv_close        = raw_close,
+ .bdrv_refresh_limits = hdev_refresh_limits,

- .bdrv_aio_readv = raw_aio_readv,
- .bdrv_aio_writev = raw_aio_writev,
+ .bdrv_aio_preadv = raw_aio_preadv,
+ .bdrv_aio_pwritev = raw_aio_pwritev,
.bdrv_aio_flush = raw_aio_flush,

.bdrv_detach_aio_context = raw_detach_aio_context,
diff --git a/block/win32-aio.c b/block/win32-aio.c
index XXXXXXX..XXXXXXX 100644
--- a/block/win32-aio.c
+++ b/block/win32-aio.c
@@ -XXX,XX +XXX,XX @@ static const AIOCBInfo win32_aiocb_info = {

BlockAIOCB *win32_aio_submit(BlockDriverState *bs,
QEMUWin32AIOState *aio, HANDLE hfile,
- int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
+ uint64_t offset, uint64_t bytes, QEMUIOVector *qiov,
BlockCompletionFunc *cb, void *opaque, int type)
{
struct QEMUWin32AIOCB *waiocb;
- uint64_t offset = sector_num * 512;
DWORD rc;

waiocb = qemu_aio_get(&win32_aiocb_info, bs, cb, opaque);
- waiocb->nbytes = nb_sectors * 512;
+ waiocb->nbytes = bytes;
waiocb->qiov = qiov;
waiocb->is_read = (type == QEMU_AIO_READ);

--
2.13.6

Add a subtree drain version to the existing test cases.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
tests/test-bdrv-drain.c | 27 ++++++++++++++++++++++++++-
1 file changed, 26 insertions(+), 1 deletion(-)

diff --git a/tests/test-bdrv-drain.c b/tests/test-bdrv-drain.c
index XXXXXXX..XXXXXXX 100644
--- a/tests/test-bdrv-drain.c
+++ b/tests/test-bdrv-drain.c
@@ -XXX,XX +XXX,XX @@ static void aio_ret_cb(void *opaque, int ret)
enum drain_type {
BDRV_DRAIN_ALL,
BDRV_DRAIN,
+ BDRV_SUBTREE_DRAIN,
DRAIN_TYPE_MAX,
};

@@ -XXX,XX +XXX,XX @@ static void do_drain_begin(enum drain_type drain_type, BlockDriverState *bs)
switch (drain_type) {
case BDRV_DRAIN_ALL: bdrv_drain_all_begin(); break;
case BDRV_DRAIN: bdrv_drained_begin(bs); break;
+ case BDRV_SUBTREE_DRAIN: bdrv_subtree_drained_begin(bs); break;
default: g_assert_not_reached();
}
}
@@ -XXX,XX +XXX,XX @@ static void do_drain_end(enum drain_type drain_type, BlockDriverState *bs)
switch (drain_type) {
case BDRV_DRAIN_ALL: bdrv_drain_all_end(); break;
case BDRV_DRAIN: bdrv_drained_end(bs); break;
+ case BDRV_SUBTREE_DRAIN: bdrv_subtree_drained_end(bs); break;
default: g_assert_not_reached();
}
}
@@ -XXX,XX +XXX,XX @@ static void test_drv_cb_drain(void)
test_drv_cb_common(BDRV_DRAIN, false);
}

+static void test_drv_cb_drain_subtree(void)
+{
+ test_drv_cb_common(BDRV_SUBTREE_DRAIN, true);
+}
+
static void test_quiesce_common(enum drain_type drain_type, bool recursive)
{
BlockBackend *blk;
@@ -XXX,XX +XXX,XX @@ static void test_quiesce_drain(void)
test_quiesce_common(BDRV_DRAIN, false);
}

+static void test_quiesce_drain_subtree(void)
+{
+ test_quiesce_common(BDRV_SUBTREE_DRAIN, true);
+}
+
static void test_nested(void)
{
BlockBackend *blk;
@@ -XXX,XX +XXX,XX @@ static void test_nested(void)
/* XXX bdrv_drain_all() doesn't increase the quiesce_counter */
int bs_quiesce = (outer != BDRV_DRAIN_ALL) +
(inner != BDRV_DRAIN_ALL);
- int backing_quiesce = 0;
+ int backing_quiesce = (outer == BDRV_SUBTREE_DRAIN) +
+ (inner == BDRV_SUBTREE_DRAIN);
int backing_cb_cnt = (outer != BDRV_DRAIN) +
(inner != BDRV_DRAIN);

@@ -XXX,XX +XXX,XX @@ static void test_blockjob_drain(void)
test_blockjob_common(BDRV_DRAIN);
}

+static void test_blockjob_drain_subtree(void)
+{
+ test_blockjob_common(BDRV_SUBTREE_DRAIN);
+}
+
int main(int argc, char **argv)
{
bdrv_init();
@@ -XXX,XX +XXX,XX @@ int main(int argc, char **argv)

g_test_add_func("/bdrv-drain/driver-cb/drain_all", test_drv_cb_drain_all);
g_test_add_func("/bdrv-drain/driver-cb/drain", test_drv_cb_drain);
+ g_test_add_func("/bdrv-drain/driver-cb/drain_subtree",
+ test_drv_cb_drain_subtree);

g_test_add_func("/bdrv-drain/quiesce/drain_all", test_quiesce_drain_all);
g_test_add_func("/bdrv-drain/quiesce/drain", test_quiesce_drain);
+ g_test_add_func("/bdrv-drain/quiesce/drain_subtree",
+ test_quiesce_drain_subtree);

g_test_add_func("/bdrv-drain/nested", test_nested);

g_test_add_func("/bdrv-drain/blockjob/drain_all", test_blockjob_drain_all);
g_test_add_func("/bdrv-drain/blockjob/drain", test_blockjob_drain);
+ g_test_add_func("/bdrv-drain/blockjob/drain_subtree",
+ test_blockjob_drain_subtree);

return g_test_run();
}
--
2.13.6
From: Max Reitz <mreitz@redhat.com>

Commit abd3622cc03cf41ed542126a540385f30a4c0175 added a case to 122
regarding how the qcow2 driver handles an incorrect compressed data
length value. This does not really fit into 122, as that file is
supposed to contain qemu-img convert test cases, which this case is not.
So this patch splits it off into its own file; maybe we will even get
more qcow2-only compression tests in the future.

Also, that test case does not work with refcount_bits=1, so mark that
option as unsupported.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20180406164108.26118-1-mreitz@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
---
tests/qemu-iotests/122 | 47 ----------------------
tests/qemu-iotests/122.out | 33 ----------------
tests/qemu-iotests/214 | 97 ++++++++++++++++++++++++++++++++++++++++++++++
tests/qemu-iotests/214.out | 35 +++++++++++++++++
tests/qemu-iotests/group | 1 +
5 files changed, 133 insertions(+), 80 deletions(-)
create mode 100755 tests/qemu-iotests/214
create mode 100644 tests/qemu-iotests/214.out

diff --git a/tests/qemu-iotests/122 b/tests/qemu-iotests/122
index XXXXXXX..XXXXXXX 100755
--- a/tests/qemu-iotests/122
+++ b/tests/qemu-iotests/122
@@ -XXX,XX +XXX,XX @@ $QEMU_IO -c "read -P 0 1024k 1022k" "$TEST_IMG" 2>&1 | _filter_qemu_io | _fil


echo
-echo "=== Corrupted size field in compressed cluster descriptor ==="
-echo
-# Create an empty image and fill half of it with compressed data.
-# The L2 entries of the two compressed clusters are located at
-# 0x800000 and 0x800008, their original values are 0x4008000000a00000
-# and 0x4008000000a00802 (5 sectors for compressed data each).
-_make_test_img 8M -o cluster_size=2M
-$QEMU_IO -c "write -c -P 0x11 0 2M" -c "write -c -P 0x11 2M 2M" "$TEST_IMG" \
- 2>&1 | _filter_qemu_io | _filter_testdir
-
-# Reduce size of compressed data to 4 sectors: this corrupts the image.
-poke_file "$TEST_IMG" $((0x800000)) "\x40\x06"
-$QEMU_IO -c "read -P 0x11 0 4M" "$TEST_IMG" 2>&1 | _filter_qemu_io | _filter_testdir
-
-# 'qemu-img check' however doesn't see anything wrong because it
-# doesn't try to decompress the data and the refcounts are consistent.
-# TODO: update qemu-img so this can be detected.
-_check_test_img
-
-# Increase size of compressed data to the maximum (8192 sectors).
-# This makes QEMU read more data (8192 sectors instead of 5, host
-# addresses [0xa00000, 0xdfffff]), but the decompression algorithm
-# stops once we have enough to restore the uncompressed cluster, so
-# the rest of the data is ignored.
-poke_file "$TEST_IMG" $((0x800000)) "\x7f\xfe"
-# Do it also for the second compressed cluster (L2 entry at 0x800008).
-# In this case the compressed data would span 3 host clusters
-# (host addresses: [0xa00802, 0xe00801])
-poke_file "$TEST_IMG" $((0x800008)) "\x7f\xfe"
-
-# Here the image is too small so we're asking QEMU to read beyond the
-# end of the image.
-$QEMU_IO -c "read -P 0x11 0 4M" "$TEST_IMG" 2>&1 | _filter_qemu_io | _filter_testdir
-# But if we grow the image we won't be reading beyond its end anymore.
-$QEMU_IO -c "write -P 0x22 4M 4M" "$TEST_IMG" 2>&1 | _filter_qemu_io | _filter_testdir
-$QEMU_IO -c "read -P 0x11 0 4M" "$TEST_IMG" 2>&1 | _filter_qemu_io | _filter_testdir
-
-# The refcount data is however wrong because due to the increased size
-# of the compressed data it now reaches the following host clusters.
-# This can be repaired by qemu-img check by increasing the refcount of
-# those clusters.
-# TODO: update qemu-img to correct the compressed cluster size instead.
-_check_test_img -r all
-$QEMU_IO -c "read -P 0x11 0 4M" "$TEST_IMG" 2>&1 | _filter_qemu_io | _filter_testdir
-$QEMU_IO -c "read -P 0x22 4M 4M" "$TEST_IMG" 2>&1 | _filter_qemu_io | _filter_testdir
-
-echo
echo "=== Full allocation with -S 0 ==="
echo

diff --git a/tests/qemu-iotests/122.out b/tests/qemu-iotests/122.out
index XXXXXXX..XXXXXXX 100644
--- a/tests/qemu-iotests/122.out
+++ b/tests/qemu-iotests/122.out
@@ -XXX,XX +XXX,XX @@ read 1024/1024 bytes at offset 1047552
read 1046528/1046528 bytes at offset 1048576
1022 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)

-=== Corrupted size field in compressed cluster descriptor ===
-
-Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=8388608
-wrote 2097152/2097152 bytes at offset 0
-2 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
-wrote 2097152/2097152 bytes at offset 2097152
-2 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
-read failed: Input/output error
-No errors were found on the image.
-read 4194304/4194304 bytes at offset 0
-4 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
-wrote 4194304/4194304 bytes at offset 4194304
-4 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
-read 4194304/4194304 bytes at offset 0
-4 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
-ERROR cluster 6 refcount=1 reference=3
-ERROR cluster 7 refcount=1 reference=2
-Repairing cluster 6 refcount=1 reference=3
-Repairing cluster 7 refcount=1 reference=2
-Repairing OFLAG_COPIED data cluster: l2_entry=8000000000c00000 refcount=3
-Repairing OFLAG_COPIED data cluster: l2_entry=8000000000e00000 refcount=2
-The following inconsistencies were found and repaired:
-
- 0 leaked clusters
- 4 corruptions
-
-Double checking the fixed image now...
-No errors were found on the image.
-read 4194304/4194304 bytes at offset 0
-4 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
-read 4194304/4194304 bytes at offset 4194304
-4 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
-
=== Full allocation with -S 0 ===

Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864
diff --git a/tests/qemu-iotests/214 b/tests/qemu-iotests/214
new file mode 100755
index XXXXXXX..XXXXXXX
--- /dev/null
+++ b/tests/qemu-iotests/214
@@ -XXX,XX +XXX,XX @@
+#!/bin/bash
+#
+# Test qcow2 image compression
+#
+# Copyright (C) 2018 Igalia, S.L.
+# Author: Alberto Garcia <berto@igalia.com>
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program. If not, see <http://www.gnu.org/licenses/>.
+#
+
+seq=$(basename "$0")
+echo "QA output created by $seq"
+
+here=$PWD
+status=1    # failure is the default!
+
+_cleanup()
+{
+ _cleanup_test_img
+}
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+# get standard environment, filters and checks
+. ./common.rc

If bdrv_do_drained_begin/end() are called in coroutine context, they
first use a BH to get out of the coroutine context. Call some existing
tests again from a coroutine to cover this code path.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
tests/test-bdrv-drain.c | 59 +++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 59 insertions(+)

diff --git a/tests/test-bdrv-drain.c b/tests/test-bdrv-drain.c
index XXXXXXX..XXXXXXX 100644
--- a/tests/test-bdrv-drain.c
+++ b/tests/test-bdrv-drain.c
@@ -XXX,XX +XXX,XX @@ static void aio_ret_cb(void *opaque, int ret)
*aio_ret = ret;
}

+typedef struct CallInCoroutineData {
+ void (*entry)(void);
+ bool done;
+} CallInCoroutineData;
+
+static coroutine_fn void call_in_coroutine_entry(void *opaque)
+{
+ CallInCoroutineData *data = opaque;
+
+ data->entry();
+ data->done = true;
+}
+
+static void call_in_coroutine(void (*entry)(void))
+{
+ Coroutine *co;
+ CallInCoroutineData data = {
+ .entry = entry,
+ .done = false,
+ };
+
+ co = qemu_coroutine_create(call_in_coroutine_entry, &data);
+ qemu_coroutine_enter(co);
+ while (!data.done) {
+ aio_poll(qemu_get_aio_context(), true);
+ }
+}
+
enum drain_type {
BDRV_DRAIN_ALL,
171
+. ./common.filter
48
BDRV_DRAIN,
49
@@ -XXX,XX +XXX,XX @@ static void test_drv_cb_drain_subtree(void)
50
test_drv_cb_common(BDRV_SUBTREE_DRAIN, true);
51
}
52
53
+static void test_drv_cb_co_drain(void)
54
+{
55
+ call_in_coroutine(test_drv_cb_drain);
56
+}
172
+
57
+
173
+_supported_fmt qcow2
58
+static void test_drv_cb_co_drain_subtree(void)
174
+_supported_proto file
59
+{
175
+_supported_os Linux
60
+ call_in_coroutine(test_drv_cb_drain_subtree);
61
+}
176
+
62
+
177
+# Repairing the corrupted image requires qemu-img check to store a
63
static void test_quiesce_common(enum drain_type drain_type, bool recursive)
178
+# refcount up to 3, which requires at least two refcount bits.
64
{
179
+_unsupported_imgopts 'refcount_bits=1[^0-9]'
65
BlockBackend *blk;
66
@@ -XXX,XX +XXX,XX @@ static void test_quiesce_drain_subtree(void)
67
test_quiesce_common(BDRV_SUBTREE_DRAIN, true);
68
}
69
70
+static void test_quiesce_co_drain(void)
71
+{
72
+ call_in_coroutine(test_quiesce_drain);
73
+}
74
+
75
+static void test_quiesce_co_drain_subtree(void)
76
+{
77
+ call_in_coroutine(test_quiesce_drain_subtree);
78
+}
79
+
80
static void test_nested(void)
81
{
82
BlockBackend *blk;
83
@@ -XXX,XX +XXX,XX @@ int main(int argc, char **argv)
84
g_test_add_func("/bdrv-drain/driver-cb/drain_subtree",
85
test_drv_cb_drain_subtree);
86
87
+ // XXX bdrv_drain_all() doesn't work in coroutine context
88
+ g_test_add_func("/bdrv-drain/driver-cb/co/drain", test_drv_cb_co_drain);
89
+ g_test_add_func("/bdrv-drain/driver-cb/co/drain_subtree",
90
+ test_drv_cb_co_drain_subtree);
180
+
91
+
181
+
92
+
182
+echo
93
g_test_add_func("/bdrv-drain/quiesce/drain_all", test_quiesce_drain_all);
183
+echo "=== Corrupted size field in compressed cluster descriptor ==="
94
g_test_add_func("/bdrv-drain/quiesce/drain", test_quiesce_drain);
184
+echo
95
g_test_add_func("/bdrv-drain/quiesce/drain_subtree",
185
+# Create an empty image and fill half of it with compressed data.
96
test_quiesce_drain_subtree);
186
+# The L2 entries of the two compressed clusters are located at
97
187
+# 0x800000 and 0x800008, their original values are 0x4008000000a00000
98
+ // XXX bdrv_drain_all() doesn't work in coroutine context
188
+# and 0x4008000000a00802 (5 sectors for compressed data each).
99
+ g_test_add_func("/bdrv-drain/quiesce/co/drain", test_quiesce_co_drain);
189
+_make_test_img 8M -o cluster_size=2M
100
+ g_test_add_func("/bdrv-drain/quiesce/co/drain_subtree",
190
+$QEMU_IO -c "write -c -P 0x11 0 2M" -c "write -c -P 0x11 2M 2M" "$TEST_IMG" \
101
+ test_quiesce_co_drain_subtree);
191
+ 2>&1 | _filter_qemu_io | _filter_testdir
192
+
102
+
193
+# Reduce size of compressed data to 4 sectors: this corrupts the image.
103
g_test_add_func("/bdrv-drain/nested", test_nested);
194
+poke_file "$TEST_IMG" $((0x800000)) "\x40\x06"
104
195
+$QEMU_IO -c "read -P 0x11 0 4M" "$TEST_IMG" 2>&1 | _filter_qemu_io | _filter_testdir
105
g_test_add_func("/bdrv-drain/blockjob/drain_all", test_blockjob_drain_all);
196
+
197
+# 'qemu-img check' however doesn't see anything wrong because it
198
+# doesn't try to decompress the data and the refcounts are consistent.
199
+# TODO: update qemu-img so this can be detected.
200
+_check_test_img
201
+
202
+# Increase size of compressed data to the maximum (8192 sectors).
203
+# This makes QEMU read more data (8192 sectors instead of 5, host
204
+# addresses [0xa00000, 0xdfffff]), but the decompression algorithm
205
+# stops once we have enough to restore the uncompressed cluster, so
206
+# the rest of the data is ignored.
207
+poke_file "$TEST_IMG" $((0x800000)) "\x7f\xfe"
208
+# Do it also for the second compressed cluster (L2 entry at 0x800008).
209
+# In this case the compressed data would span 3 host clusters
210
+# (host addresses: [0xa00802, 0xe00801])
211
+poke_file "$TEST_IMG" $((0x800008)) "\x7f\xfe"
212
+
213
+# Here the image is too small so we're asking QEMU to read beyond the
214
+# end of the image.
215
+$QEMU_IO -c "read -P 0x11 0 4M" "$TEST_IMG" 2>&1 | _filter_qemu_io | _filter_testdir
216
+# But if we grow the image we won't be reading beyond its end anymore.
217
+$QEMU_IO -c "write -P 0x22 4M 4M" "$TEST_IMG" 2>&1 | _filter_qemu_io | _filter_testdir
218
+$QEMU_IO -c "read -P 0x11 0 4M" "$TEST_IMG" 2>&1 | _filter_qemu_io | _filter_testdir
219
+
220
+# The refcount data is however wrong because, due to the increased size
221
+# of the compressed data, it now reaches the following host clusters.
222
+# This can be repaired by qemu-img check by increasing the refcount of
223
+# those clusters.
224
+# TODO: update qemu-img to correct the compressed cluster size instead.
225
+_check_test_img -r all
226
+$QEMU_IO -c "read -P 0x11 0 4M" "$TEST_IMG" 2>&1 | _filter_qemu_io | _filter_testdir
227
+$QEMU_IO -c "read -P 0x22 4M 4M" "$TEST_IMG" 2>&1 | _filter_qemu_io | _filter_testdir
228
+
229
+# success, all done
230
+echo '*** done'
231
+rm -f $seq.full
232
+status=0
233
diff --git a/tests/qemu-iotests/214.out b/tests/qemu-iotests/214.out
234
new file mode 100644
235
index XXXXXXX..XXXXXXX
236
--- /dev/null
237
+++ b/tests/qemu-iotests/214.out
238
@@ -XXX,XX +XXX,XX @@
239
+QA output created by 214
240
+
241
+=== Corrupted size field in compressed cluster descriptor ===
242
+
243
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=8388608
244
+wrote 2097152/2097152 bytes at offset 0
245
+2 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
246
+wrote 2097152/2097152 bytes at offset 2097152
247
+2 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
248
+read failed: Input/output error
249
+No errors were found on the image.
250
+read 4194304/4194304 bytes at offset 0
251
+4 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
252
+wrote 4194304/4194304 bytes at offset 4194304
253
+4 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
254
+read 4194304/4194304 bytes at offset 0
255
+4 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
256
+ERROR cluster 6 refcount=1 reference=3
257
+ERROR cluster 7 refcount=1 reference=2
258
+Repairing cluster 6 refcount=1 reference=3
259
+Repairing cluster 7 refcount=1 reference=2
260
+Repairing OFLAG_COPIED data cluster: l2_entry=8000000000c00000 refcount=3
261
+Repairing OFLAG_COPIED data cluster: l2_entry=8000000000e00000 refcount=2
262
+The following inconsistencies were found and repaired:
263
+
264
+ 0 leaked clusters
265
+ 4 corruptions
266
+
267
+Double checking the fixed image now...
268
+No errors were found on the image.
269
+read 4194304/4194304 bytes at offset 0
270
+4 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
271
+read 4194304/4194304 bytes at offset 4194304
272
+4 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
273
+*** done
274
diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
275
index XXXXXXX..XXXXXXX 100644
276
--- a/tests/qemu-iotests/group
277
+++ b/tests/qemu-iotests/group
278
@@ -XXX,XX +XXX,XX @@
279
211 rw auto quick
280
212 rw auto quick
281
213 rw auto quick
282
+214 rw auto
283
218 rw auto quick
284
--
106
--
285
2.13.6
107
2.13.6
286
108
287
109
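The size-field pokes in test 214 above (0x4008 originally, 0x4006 to shrink, 0x7ffe to maximize) can be decoded with a small sketch following the qcow2 specification; the helper name is ours, not QEMU's. For cluster_bits = 21 (2 MiB clusters) the size field occupies bits 49..61 of the compressed L2 entry and stores the number of 512-byte sectors of compressed data, minus one:

```c
#include <assert.h>
#include <stdint.h>

/* Sketch (helper name is ours): decode the size field of a qcow2
 * compressed cluster descriptor into a sector count. */
static unsigned nb_csectors(uint64_t l2_entry, unsigned cluster_bits)
{
    unsigned csize_shift = 62 - (cluster_bits - 8);
    uint64_t csize_mask = (1ULL << (cluster_bits - 8)) - 1;

    return (unsigned)((l2_entry >> csize_shift) & csize_mask) + 1;
}
```

With 2 MiB clusters, the original entry 0x4008000000a00000 decodes to 5 sectors; overwriting the first two bytes with 0x4006 yields 4, and 0x7ffe yields the maximum of 8192, matching the values the test relies on.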
1
From: Eric Blake <eblake@redhat.com>
1
Test that drain sections are correctly propagated through the graph.
2
2
3
We are gradually moving away from sector-based interfaces, towards
4
byte-based. Make the change for the last few sector-based callbacks
5
in the vxhs driver.
6
7
Note that the driver was already using byte-based calls for
8
performing actual I/O, so this just gets rid of a round trip
9
of scaling; however, as I don't know if VxHS is tolerant of
10
non-sector AIO operations, I went with the conservative approach
11
of adding .bdrv_refresh_limits to override the block layer
12
defaults back to the pre-patch value of 512.
13
14
Signed-off-by: Eric Blake <eblake@redhat.com>
15
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
3
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
16
---
4
---
17
block/vxhs.c | 43 ++++++++++++++++++++++---------------------
5
tests/test-bdrv-drain.c | 74 +++++++++++++++++++++++++++++++++++++++++++++++++
18
1 file changed, 22 insertions(+), 21 deletions(-)
6
1 file changed, 74 insertions(+)
19
7
20
diff --git a/block/vxhs.c b/block/vxhs.c
8
diff --git a/tests/test-bdrv-drain.c b/tests/test-bdrv-drain.c
21
index XXXXXXX..XXXXXXX 100644
9
index XXXXXXX..XXXXXXX 100644
22
--- a/block/vxhs.c
10
--- a/tests/test-bdrv-drain.c
23
+++ b/block/vxhs.c
11
+++ b/tests/test-bdrv-drain.c
24
@@ -XXX,XX +XXX,XX @@ static void vxhs_parse_filename(const char *filename, QDict *options,
12
@@ -XXX,XX +XXX,XX @@ static void test_nested(void)
25
}
13
blk_unref(blk);
26
}
14
}
27
15
28
+static void vxhs_refresh_limits(BlockDriverState *bs, Error **errp)
16
+static void test_multiparent(void)
29
+{
17
+{
30
+ /* XXX Does VXHS support AIO on less than 512-byte alignment? */
18
+ BlockBackend *blk_a, *blk_b;
31
+ bs->bl.request_alignment = 512;
19
+ BlockDriverState *bs_a, *bs_b, *backing;
20
+ BDRVTestState *a_s, *b_s, *backing_s;
21
+
22
+ blk_a = blk_new(BLK_PERM_ALL, BLK_PERM_ALL);
23
+ bs_a = bdrv_new_open_driver(&bdrv_test, "test-node-a", BDRV_O_RDWR,
24
+ &error_abort);
25
+ a_s = bs_a->opaque;
26
+ blk_insert_bs(blk_a, bs_a, &error_abort);
27
+
28
+ blk_b = blk_new(BLK_PERM_ALL, BLK_PERM_ALL);
29
+ bs_b = bdrv_new_open_driver(&bdrv_test, "test-node-b", BDRV_O_RDWR,
30
+ &error_abort);
31
+ b_s = bs_b->opaque;
32
+ blk_insert_bs(blk_b, bs_b, &error_abort);
33
+
34
+ backing = bdrv_new_open_driver(&bdrv_test, "backing", 0, &error_abort);
35
+ backing_s = backing->opaque;
36
+ bdrv_set_backing_hd(bs_a, backing, &error_abort);
37
+ bdrv_set_backing_hd(bs_b, backing, &error_abort);
38
+
39
+ g_assert_cmpint(bs_a->quiesce_counter, ==, 0);
40
+ g_assert_cmpint(bs_b->quiesce_counter, ==, 0);
41
+ g_assert_cmpint(backing->quiesce_counter, ==, 0);
42
+ g_assert_cmpint(a_s->drain_count, ==, 0);
43
+ g_assert_cmpint(b_s->drain_count, ==, 0);
44
+ g_assert_cmpint(backing_s->drain_count, ==, 0);
45
+
46
+ do_drain_begin(BDRV_SUBTREE_DRAIN, bs_a);
47
+
48
+ g_assert_cmpint(bs_a->quiesce_counter, ==, 1);
49
+ g_assert_cmpint(bs_b->quiesce_counter, ==, 1);
50
+ g_assert_cmpint(backing->quiesce_counter, ==, 1);
51
+ g_assert_cmpint(a_s->drain_count, ==, 1);
52
+ g_assert_cmpint(b_s->drain_count, ==, 1);
53
+ g_assert_cmpint(backing_s->drain_count, ==, 1);
54
+
55
+ do_drain_begin(BDRV_SUBTREE_DRAIN, bs_b);
56
+
57
+ g_assert_cmpint(bs_a->quiesce_counter, ==, 2);
58
+ g_assert_cmpint(bs_b->quiesce_counter, ==, 2);
59
+ g_assert_cmpint(backing->quiesce_counter, ==, 2);
60
+ g_assert_cmpint(a_s->drain_count, ==, 2);
61
+ g_assert_cmpint(b_s->drain_count, ==, 2);
62
+ g_assert_cmpint(backing_s->drain_count, ==, 2);
63
+
64
+ do_drain_end(BDRV_SUBTREE_DRAIN, bs_b);
65
+
66
+ g_assert_cmpint(bs_a->quiesce_counter, ==, 1);
67
+ g_assert_cmpint(bs_b->quiesce_counter, ==, 1);
68
+ g_assert_cmpint(backing->quiesce_counter, ==, 1);
69
+ g_assert_cmpint(a_s->drain_count, ==, 1);
70
+ g_assert_cmpint(b_s->drain_count, ==, 1);
71
+ g_assert_cmpint(backing_s->drain_count, ==, 1);
72
+
73
+ do_drain_end(BDRV_SUBTREE_DRAIN, bs_a);
74
+
75
+ g_assert_cmpint(bs_a->quiesce_counter, ==, 0);
76
+ g_assert_cmpint(bs_b->quiesce_counter, ==, 0);
77
+ g_assert_cmpint(backing->quiesce_counter, ==, 0);
78
+ g_assert_cmpint(a_s->drain_count, ==, 0);
79
+ g_assert_cmpint(b_s->drain_count, ==, 0);
80
+ g_assert_cmpint(backing_s->drain_count, ==, 0);
81
+
82
+ bdrv_unref(backing);
83
+ bdrv_unref(bs_a);
84
+ bdrv_unref(bs_b);
85
+ blk_unref(blk_a);
86
+ blk_unref(blk_b);
32
+}
87
+}
33
+
88
+
34
static int vxhs_init_and_ref(void)
89
35
{
90
typedef struct TestBlockJob {
36
if (vxhs_ref++ == 0) {
91
BlockJob common;
37
@@ -XXX,XX +XXX,XX @@ static const AIOCBInfo vxhs_aiocb_info = {
92
@@ -XXX,XX +XXX,XX @@ int main(int argc, char **argv)
38
* and is passed to QNIO. When QNIO completes the work,
93
test_quiesce_co_drain_subtree);
39
* it will be passed back through the callback.
94
40
*/
95
g_test_add_func("/bdrv-drain/nested", test_nested);
41
-static BlockAIOCB *vxhs_aio_rw(BlockDriverState *bs, int64_t sector_num,
96
+ g_test_add_func("/bdrv-drain/multiparent", test_multiparent);
42
- QEMUIOVector *qiov, int nb_sectors,
97
43
+static BlockAIOCB *vxhs_aio_rw(BlockDriverState *bs, uint64_t offset,
98
g_test_add_func("/bdrv-drain/blockjob/drain_all", test_blockjob_drain_all);
44
+ QEMUIOVector *qiov, uint64_t size,
99
g_test_add_func("/bdrv-drain/blockjob/drain", test_blockjob_drain);
45
BlockCompletionFunc *cb, void *opaque,
46
VDISKAIOCmd iodir)
47
{
48
VXHSAIOCB *acb = NULL;
49
BDRVVXHSState *s = bs->opaque;
50
- size_t size;
51
- uint64_t offset;
52
int iio_flags = 0;
53
int ret = 0;
54
void *dev_handle = s->vdisk_hostinfo.dev_handle;
55
56
- offset = sector_num * BDRV_SECTOR_SIZE;
57
- size = nb_sectors * BDRV_SECTOR_SIZE;
58
acb = qemu_aio_get(&vxhs_aiocb_info, bs, cb, opaque);
59
60
/*
61
@@ -XXX,XX +XXX,XX @@ static BlockAIOCB *vxhs_aio_rw(BlockDriverState *bs, int64_t sector_num,
62
switch (iodir) {
63
case VDISK_AIO_WRITE:
64
ret = iio_writev(dev_handle, acb, qiov->iov, qiov->niov,
65
- offset, (uint64_t)size, iio_flags);
66
+ offset, size, iio_flags);
67
break;
68
case VDISK_AIO_READ:
69
ret = iio_readv(dev_handle, acb, qiov->iov, qiov->niov,
70
- offset, (uint64_t)size, iio_flags);
71
+ offset, size, iio_flags);
72
break;
73
default:
74
trace_vxhs_aio_rw_invalid(iodir);
75
@@ -XXX,XX +XXX,XX @@ errout:
76
return NULL;
77
}
78
79
-static BlockAIOCB *vxhs_aio_readv(BlockDriverState *bs,
80
- int64_t sector_num, QEMUIOVector *qiov,
81
- int nb_sectors,
82
+static BlockAIOCB *vxhs_aio_preadv(BlockDriverState *bs,
83
+ uint64_t offset, uint64_t bytes,
84
+ QEMUIOVector *qiov, int flags,
85
BlockCompletionFunc *cb, void *opaque)
86
{
87
- return vxhs_aio_rw(bs, sector_num, qiov, nb_sectors, cb,
88
- opaque, VDISK_AIO_READ);
89
+ return vxhs_aio_rw(bs, offset, qiov, bytes, cb, opaque, VDISK_AIO_READ);
90
}
91
92
-static BlockAIOCB *vxhs_aio_writev(BlockDriverState *bs,
93
- int64_t sector_num, QEMUIOVector *qiov,
94
- int nb_sectors,
95
- BlockCompletionFunc *cb, void *opaque)
96
+static BlockAIOCB *vxhs_aio_pwritev(BlockDriverState *bs,
97
+ uint64_t offset, uint64_t bytes,
98
+ QEMUIOVector *qiov, int flags,
99
+ BlockCompletionFunc *cb, void *opaque)
100
{
101
- return vxhs_aio_rw(bs, sector_num, qiov, nb_sectors,
102
- cb, opaque, VDISK_AIO_WRITE);
103
+ return vxhs_aio_rw(bs, offset, qiov, bytes, cb, opaque, VDISK_AIO_WRITE);
104
}
105
106
static void vxhs_close(BlockDriverState *bs)
107
@@ -XXX,XX +XXX,XX @@ static BlockDriver bdrv_vxhs = {
108
.instance_size = sizeof(BDRVVXHSState),
109
.bdrv_file_open = vxhs_open,
110
.bdrv_parse_filename = vxhs_parse_filename,
111
+ .bdrv_refresh_limits = vxhs_refresh_limits,
112
.bdrv_close = vxhs_close,
113
.bdrv_getlength = vxhs_getlength,
114
- .bdrv_aio_readv = vxhs_aio_readv,
115
- .bdrv_aio_writev = vxhs_aio_writev,
116
+ .bdrv_aio_preadv = vxhs_aio_preadv,
117
+ .bdrv_aio_pwritev = vxhs_aio_pwritev,
118
};
119
120
static void bdrv_vxhs_init(void)
121
--
100
--
122
2.13.6
101
2.13.6
123
102
124
103
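The .bdrv_refresh_limits hook in the vxhs patch above pins request_alignment back to 512 bytes, which means the block layer, not the driver, widens unaligned byte requests to sector boundaries before they reach the driver. A minimal sketch of that rounding (ours, not the block layer's actual code; align must be a power of two):

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative only: widen a byte-granularity request to the driver's
 * request_alignment, the way the block layer does before calling into
 * a driver that cannot handle sub-sector requests. */
static void align_request(uint64_t offset, uint64_t bytes, uint64_t align,
                          uint64_t *aligned_offset, uint64_t *aligned_bytes)
{
    uint64_t end = offset + bytes;

    *aligned_offset = offset & ~(align - 1);                  /* round down */
    *aligned_bytes = ((end + align - 1) & ~(align - 1)) - *aligned_offset;
}
```

For example, a 100-byte read at offset 700 becomes a 512-byte read at offset 512, from which the requested range is then extracted.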
1
From: Max Reitz <mreitz@redhat.com>
1
We need to remember how many of the drain sections in which a node is
2
2
were recursive (i.e. subtree drain rather than node drain), so that they
3
This flag signifies that a write request will not change the visible
3
can be correctly applied when children are added or removed during the
4
disk content. With this flag set, it is sufficient to have the
4
drained section.
5
BLK_PERM_WRITE_UNCHANGED permission instead of BLK_PERM_WRITE.
5
6
6
With this change, it is safe to modify the graph even inside a
7
Signed-off-by: Max Reitz <mreitz@redhat.com>
7
bdrv_subtree_drained_begin/end() section.
8
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
8
9
Reviewed-by: Alberto Garcia <berto@igalia.com>
9
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
10
Message-id: 20180421132929.21610-4-mreitz@redhat.com
11
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
12
Signed-off-by: Max Reitz <mreitz@redhat.com>
13
---
10
---
14
include/block/block.h | 6 +++++-
11
include/block/block.h | 2 --
15
block/io.c | 6 +++++-
12
include/block/block_int.h | 5 +++++
16
2 files changed, 10 insertions(+), 2 deletions(-)
13
block.c | 32 +++++++++++++++++++++++++++++---
14
block/io.c | 28 ++++++++++++++++++++++++----
15
4 files changed, 58 insertions(+), 9 deletions(-)
17
16
18
diff --git a/include/block/block.h b/include/block/block.h
17
diff --git a/include/block/block.h b/include/block/block.h
19
index XXXXXXX..XXXXXXX 100644
18
index XXXXXXX..XXXXXXX 100644
20
--- a/include/block/block.h
19
--- a/include/block/block.h
21
+++ b/include/block/block.h
20
+++ b/include/block/block.h
22
@@ -XXX,XX +XXX,XX @@ typedef enum {
21
@@ -XXX,XX +XXX,XX @@ void bdrv_drained_begin(BlockDriverState *bs);
23
BDRV_REQ_FUA = 0x10,
22
/**
24
BDRV_REQ_WRITE_COMPRESSED = 0x20,
23
* Like bdrv_drained_begin, but recursively begins a quiesced section for
25
24
* exclusive access to all child nodes as well.
26
+ /* Signifies that this write request will not change the visible disk
25
- *
27
+ * content. */
26
- * Graph changes are not allowed during a subtree drain section.
28
+ BDRV_REQ_WRITE_UNCHANGED = 0x40,
27
*/
29
+
28
void bdrv_subtree_drained_begin(BlockDriverState *bs);
30
/* Mask of valid flags */
29
31
- BDRV_REQ_MASK = 0x3f,
30
diff --git a/include/block/block_int.h b/include/block/block_int.h
32
+ BDRV_REQ_MASK = 0x7f,
31
index XXXXXXX..XXXXXXX 100644
33
} BdrvRequestFlags;
32
--- a/include/block/block_int.h
34
33
+++ b/include/block/block_int.h
35
typedef struct BlockSizes {
34
@@ -XXX,XX +XXX,XX @@ struct BlockDriverState {
35
36
/* Accessed with atomic ops. */
37
int quiesce_counter;
38
+ int recursive_quiesce_counter;
39
+
40
unsigned int write_gen; /* Current data generation */
41
42
/* Protected by reqs_lock. */
43
@@ -XXX,XX +XXX,XX @@ int coroutine_fn bdrv_co_pwritev(BdrvChild *child,
44
int64_t offset, unsigned int bytes, QEMUIOVector *qiov,
45
BdrvRequestFlags flags);
46
47
+void bdrv_apply_subtree_drain(BdrvChild *child, BlockDriverState *new_parent);
48
+void bdrv_unapply_subtree_drain(BdrvChild *child, BlockDriverState *old_parent);
49
+
50
int get_tmp_filename(char *filename, int size);
51
BlockDriver *bdrv_probe_all(const uint8_t *buf, int buf_size,
52
const char *filename);
53
diff --git a/block.c b/block.c
54
index XXXXXXX..XXXXXXX 100644
55
--- a/block.c
56
+++ b/block.c
57
@@ -XXX,XX +XXX,XX @@ static void bdrv_child_cb_drained_end(BdrvChild *child)
58
bdrv_drained_end(bs);
59
}
60
61
+static void bdrv_child_cb_attach(BdrvChild *child)
62
+{
63
+ BlockDriverState *bs = child->opaque;
64
+ bdrv_apply_subtree_drain(child, bs);
65
+}
66
+
67
+static void bdrv_child_cb_detach(BdrvChild *child)
68
+{
69
+ BlockDriverState *bs = child->opaque;
70
+ bdrv_unapply_subtree_drain(child, bs);
71
+}
72
+
73
static int bdrv_child_cb_inactivate(BdrvChild *child)
74
{
75
BlockDriverState *bs = child->opaque;
76
@@ -XXX,XX +XXX,XX @@ const BdrvChildRole child_file = {
77
.inherit_options = bdrv_inherited_options,
78
.drained_begin = bdrv_child_cb_drained_begin,
79
.drained_end = bdrv_child_cb_drained_end,
80
+ .attach = bdrv_child_cb_attach,
81
+ .detach = bdrv_child_cb_detach,
82
.inactivate = bdrv_child_cb_inactivate,
83
};
84
85
@@ -XXX,XX +XXX,XX @@ const BdrvChildRole child_format = {
86
.inherit_options = bdrv_inherited_fmt_options,
87
.drained_begin = bdrv_child_cb_drained_begin,
88
.drained_end = bdrv_child_cb_drained_end,
89
+ .attach = bdrv_child_cb_attach,
90
+ .detach = bdrv_child_cb_detach,
91
.inactivate = bdrv_child_cb_inactivate,
92
};
93
94
@@ -XXX,XX +XXX,XX @@ static void bdrv_backing_attach(BdrvChild *c)
95
parent->backing_blocker);
96
bdrv_op_unblock(backing_hd, BLOCK_OP_TYPE_BACKUP_TARGET,
97
parent->backing_blocker);
98
+
99
+ bdrv_child_cb_attach(c);
100
}
101
102
static void bdrv_backing_detach(BdrvChild *c)
103
@@ -XXX,XX +XXX,XX @@ static void bdrv_backing_detach(BdrvChild *c)
104
bdrv_op_unblock_all(c->bs, parent->backing_blocker);
105
error_free(parent->backing_blocker);
106
parent->backing_blocker = NULL;
107
+
108
+ bdrv_child_cb_detach(c);
109
}
110
111
/*
112
@@ -XXX,XX +XXX,XX @@ static void bdrv_replace_child_noperm(BdrvChild *child,
113
assert(bdrv_get_aio_context(old_bs) == bdrv_get_aio_context(new_bs));
114
}
115
if (old_bs) {
116
+ /* Detach first so that the recursive drain sections coming from @child
117
+ * are already gone and we only end the drain sections that came from
118
+ * elsewhere. */
119
+ if (child->role->detach) {
120
+ child->role->detach(child);
121
+ }
122
if (old_bs->quiesce_counter && child->role->drained_end) {
123
for (i = 0; i < old_bs->quiesce_counter; i++) {
124
child->role->drained_end(child);
125
}
126
}
127
- if (child->role->detach) {
128
- child->role->detach(child);
129
- }
130
QLIST_REMOVE(child, next_parent);
131
}
132
133
@@ -XXX,XX +XXX,XX @@ static void bdrv_replace_child_noperm(BdrvChild *child,
134
}
135
}
136
137
+ /* Attach only after starting new drained sections, so that recursive
138
+ * drain sections coming from @child don't get an extra .drained_begin
139
+ * callback. */
140
if (child->role->attach) {
141
child->role->attach(child);
142
}
36
diff --git a/block/io.c b/block/io.c
143
diff --git a/block/io.c b/block/io.c
37
index XXXXXXX..XXXXXXX 100644
144
index XXXXXXX..XXXXXXX 100644
38
--- a/block/io.c
145
--- a/block/io.c
39
+++ b/block/io.c
146
+++ b/block/io.c
40
@@ -XXX,XX +XXX,XX @@ static int coroutine_fn bdrv_aligned_pwritev(BdrvChild *child,
147
@@ -XXX,XX +XXX,XX @@ static void coroutine_fn bdrv_co_yield_to_drain(BlockDriverState *bs,
41
assert(!waited || !req->serialising);
148
assert(data.done);
42
assert(req->overlap_offset <= offset);
149
}
43
assert(offset + bytes <= req->overlap_offset + req->overlap_bytes);
150
44
- assert(child->perm & BLK_PERM_WRITE);
151
-static void bdrv_do_drained_begin(BlockDriverState *bs, bool recursive,
45
+ if (flags & BDRV_REQ_WRITE_UNCHANGED) {
152
- BdrvChild *parent)
46
+ assert(child->perm & (BLK_PERM_WRITE_UNCHANGED | BLK_PERM_WRITE));
153
+void bdrv_do_drained_begin(BlockDriverState *bs, bool recursive,
47
+ } else {
154
+ BdrvChild *parent)
48
+ assert(child->perm & BLK_PERM_WRITE);
155
{
156
BdrvChild *child, *next;
157
158
@@ -XXX,XX +XXX,XX @@ static void bdrv_do_drained_begin(BlockDriverState *bs, bool recursive,
159
bdrv_drain_recurse(bs);
160
161
if (recursive) {
162
+ bs->recursive_quiesce_counter++;
163
QLIST_FOREACH_SAFE(child, &bs->children, next, next) {
164
bdrv_do_drained_begin(child->bs, true, child);
165
}
166
@@ -XXX,XX +XXX,XX @@ void bdrv_subtree_drained_begin(BlockDriverState *bs)
167
bdrv_do_drained_begin(bs, true, NULL);
168
}
169
170
-static void bdrv_do_drained_end(BlockDriverState *bs, bool recursive,
171
- BdrvChild *parent)
172
+void bdrv_do_drained_end(BlockDriverState *bs, bool recursive,
173
+ BdrvChild *parent)
174
{
175
BdrvChild *child, *next;
176
int old_quiesce_counter;
177
@@ -XXX,XX +XXX,XX @@ static void bdrv_do_drained_end(BlockDriverState *bs, bool recursive,
178
}
179
180
if (recursive) {
181
+ bs->recursive_quiesce_counter--;
182
QLIST_FOREACH_SAFE(child, &bs->children, next, next) {
183
bdrv_do_drained_end(child->bs, true, child);
184
}
185
@@ -XXX,XX +XXX,XX @@ void bdrv_subtree_drained_end(BlockDriverState *bs)
186
bdrv_do_drained_end(bs, true, NULL);
187
}
188
189
+void bdrv_apply_subtree_drain(BdrvChild *child, BlockDriverState *new_parent)
190
+{
191
+ int i;
192
+
193
+ for (i = 0; i < new_parent->recursive_quiesce_counter; i++) {
194
+ bdrv_do_drained_begin(child->bs, true, child);
49
+ }
195
+ }
50
assert(end_sector <= bs->total_sectors || child->perm & BLK_PERM_RESIZE);
196
+}
51
197
+
52
ret = notifier_with_return_list_notify(&bs->before_write_notifiers, req);
198
+void bdrv_unapply_subtree_drain(BdrvChild *child, BlockDriverState *old_parent)
199
+{
200
+ int i;
201
+
202
+ for (i = 0; i < old_parent->recursive_quiesce_counter; i++) {
203
+ bdrv_do_drained_end(child->bs, true, child);
204
+ }
205
+}
206
+
207
/*
208
* Wait for pending requests to complete on a single BlockDriverState subtree,
209
* and suspend block driver's internal I/O until next request arrives.
53
--
210
--
54
2.13.6
211
2.13.6
55
212
56
213
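The bookkeeping introduced above — counting how many of a node's active drain sections were recursive so they can be replayed when a child is attached (cf. bdrv_apply_subtree_drain) — can be sketched in miniature. The structures and names below are illustrative, not QEMU's:

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CHILDREN 4

typedef struct Node {
    int quiesce_counter;           /* how many drain sections cover us */
    int recursive_quiesce_counter; /* how many of those were recursive */
    struct Node *children[MAX_CHILDREN];
    int nchildren;
} Node;

static void drained_begin(Node *n, int recursive)
{
    n->quiesce_counter++;
    if (recursive) {
        n->recursive_quiesce_counter++;
        for (int i = 0; i < n->nchildren; i++) {
            drained_begin(n->children[i], 1);
        }
    }
}

static void drained_end(Node *n, int recursive)
{
    n->quiesce_counter--;
    if (recursive) {
        n->recursive_quiesce_counter--;
        for (int i = 0; i < n->nchildren; i++) {
            drained_end(n->children[i], 1);
        }
    }
}

/* When a child is attached to a new parent, replay the parent's active
 * recursive drain sections onto the child, so the child ends up as
 * quiesced as if it had been attached all along. */
static void attach_child(Node *parent, Node *child)
{
    assert(parent->nchildren < MAX_CHILDREN);
    parent->children[parent->nchildren++] = child;
    for (int i = 0; i < parent->recursive_quiesce_counter; i++) {
        drained_begin(child, 1);
    }
}
```

A child attached inside two nested subtree drains immediately reaches quiesce_counter == 2, and the matching drained_end calls on the parent bring both counters back to zero.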
1
From: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
2
3
blk_get_aio_context checks whether BlockDriverState bs is non-NULL,
4
returning bdrv_get_aio_context(bs) if so or qemu_get_aio_context()
5
otherwise. However, bdrv_get_aio_context from block.c already does
6
this verification itself, also returning qemu_get_aio_context()
7
if bs is NULL:
8
9
AioContext *bdrv_get_aio_context(BlockDriverState *bs)
10
{
11
return bs ? bs->aio_context : qemu_get_aio_context();
12
}
13
14
This patch simplifies blk_get_aio_context to simply call
15
bdrv_get_aio_context instead of replicating the same logic.
16
17
Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
18
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
19
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
1
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
20
---
2
---
21
block/block-backend.c | 8 +-------
3
tests/test-bdrv-drain.c | 80 +++++++++++++++++++++++++++++++++++++++++++++++++
22
1 file changed, 1 insertion(+), 7 deletions(-)
4
1 file changed, 80 insertions(+)
23
5
24
diff --git a/block/block-backend.c b/block/block-backend.c
6
diff --git a/tests/test-bdrv-drain.c b/tests/test-bdrv-drain.c
25
index XXXXXXX..XXXXXXX 100644
7
index XXXXXXX..XXXXXXX 100644
26
--- a/block/block-backend.c
8
--- a/tests/test-bdrv-drain.c
27
+++ b/block/block-backend.c
9
+++ b/tests/test-bdrv-drain.c
28
@@ -XXX,XX +XXX,XX @@ void blk_op_unblock_all(BlockBackend *blk, Error *reason)
10
@@ -XXX,XX +XXX,XX @@ static void test_multiparent(void)
29
11
blk_unref(blk_b);
30
AioContext *blk_get_aio_context(BlockBackend *blk)
31
{
32
- BlockDriverState *bs = blk_bs(blk);
33
-
34
- if (bs) {
35
- return bdrv_get_aio_context(bs);
36
- } else {
37
- return qemu_get_aio_context();
38
- }
39
+ return bdrv_get_aio_context(blk_bs(blk));
40
}
12
}
41
13
42
static AioContext *blk_aiocb_get_aio_context(BlockAIOCB *acb)
14
+static void test_graph_change(void)
15
+{
16
+ BlockBackend *blk_a, *blk_b;
17
+ BlockDriverState *bs_a, *bs_b, *backing;
18
+ BDRVTestState *a_s, *b_s, *backing_s;
19
+
20
+ blk_a = blk_new(BLK_PERM_ALL, BLK_PERM_ALL);
21
+ bs_a = bdrv_new_open_driver(&bdrv_test, "test-node-a", BDRV_O_RDWR,
22
+ &error_abort);
23
+ a_s = bs_a->opaque;
24
+ blk_insert_bs(blk_a, bs_a, &error_abort);
25
+
26
+ blk_b = blk_new(BLK_PERM_ALL, BLK_PERM_ALL);
27
+ bs_b = bdrv_new_open_driver(&bdrv_test, "test-node-b", BDRV_O_RDWR,
28
+ &error_abort);
29
+ b_s = bs_b->opaque;
30
+ blk_insert_bs(blk_b, bs_b, &error_abort);
31
+
32
+ backing = bdrv_new_open_driver(&bdrv_test, "backing", 0, &error_abort);
33
+ backing_s = backing->opaque;
34
+ bdrv_set_backing_hd(bs_a, backing, &error_abort);
35
+
36
+ g_assert_cmpint(bs_a->quiesce_counter, ==, 0);
37
+ g_assert_cmpint(bs_b->quiesce_counter, ==, 0);
38
+ g_assert_cmpint(backing->quiesce_counter, ==, 0);
39
+ g_assert_cmpint(a_s->drain_count, ==, 0);
40
+ g_assert_cmpint(b_s->drain_count, ==, 0);
41
+ g_assert_cmpint(backing_s->drain_count, ==, 0);
42
+
43
+ do_drain_begin(BDRV_SUBTREE_DRAIN, bs_a);
44
+ do_drain_begin(BDRV_SUBTREE_DRAIN, bs_a);
45
+ do_drain_begin(BDRV_SUBTREE_DRAIN, bs_a);
46
+ do_drain_begin(BDRV_SUBTREE_DRAIN, bs_b);
47
+ do_drain_begin(BDRV_SUBTREE_DRAIN, bs_b);
48
+
49
+ bdrv_set_backing_hd(bs_b, backing, &error_abort);
50
+ g_assert_cmpint(bs_a->quiesce_counter, ==, 5);
51
+ g_assert_cmpint(bs_b->quiesce_counter, ==, 5);
52
+ g_assert_cmpint(backing->quiesce_counter, ==, 5);
53
+ g_assert_cmpint(a_s->drain_count, ==, 5);
54
+ g_assert_cmpint(b_s->drain_count, ==, 5);
55
+ g_assert_cmpint(backing_s->drain_count, ==, 5);
56
+
57
+ bdrv_set_backing_hd(bs_b, NULL, &error_abort);
58
+ g_assert_cmpint(bs_a->quiesce_counter, ==, 3);
59
+ g_assert_cmpint(bs_b->quiesce_counter, ==, 2);
60
+ g_assert_cmpint(backing->quiesce_counter, ==, 3);
61
+ g_assert_cmpint(a_s->drain_count, ==, 3);
62
+ g_assert_cmpint(b_s->drain_count, ==, 2);
63
+ g_assert_cmpint(backing_s->drain_count, ==, 3);
64
+
65
+ bdrv_set_backing_hd(bs_b, backing, &error_abort);
66
+ g_assert_cmpint(bs_a->quiesce_counter, ==, 5);
67
+ g_assert_cmpint(bs_b->quiesce_counter, ==, 5);
68
+ g_assert_cmpint(backing->quiesce_counter, ==, 5);
69
+ g_assert_cmpint(a_s->drain_count, ==, 5);
70
+ g_assert_cmpint(b_s->drain_count, ==, 5);
71
+ g_assert_cmpint(backing_s->drain_count, ==, 5);
72
+
73
+ do_drain_end(BDRV_SUBTREE_DRAIN, bs_b);
74
+ do_drain_end(BDRV_SUBTREE_DRAIN, bs_b);
75
+ do_drain_end(BDRV_SUBTREE_DRAIN, bs_a);
76
+ do_drain_end(BDRV_SUBTREE_DRAIN, bs_a);
77
+ do_drain_end(BDRV_SUBTREE_DRAIN, bs_a);
78
+
79
+ g_assert_cmpint(bs_a->quiesce_counter, ==, 0);
80
+ g_assert_cmpint(bs_b->quiesce_counter, ==, 0);
81
+ g_assert_cmpint(backing->quiesce_counter, ==, 0);
82
+ g_assert_cmpint(a_s->drain_count, ==, 0);
83
+ g_assert_cmpint(b_s->drain_count, ==, 0);
84
+ g_assert_cmpint(backing_s->drain_count, ==, 0);
85
+
86
+ bdrv_unref(backing);
87
+ bdrv_unref(bs_a);
88
+ bdrv_unref(bs_b);
89
+ blk_unref(blk_a);
90
+ blk_unref(blk_b);
91
+}
92
+
93
94
typedef struct TestBlockJob {
95
BlockJob common;
96
@@ -XXX,XX +XXX,XX @@ int main(int argc, char **argv)
97
98
g_test_add_func("/bdrv-drain/nested", test_nested);
99
g_test_add_func("/bdrv-drain/multiparent", test_multiparent);
100
+ g_test_add_func("/bdrv-drain/graph-change", test_graph_change);
101
102
g_test_add_func("/bdrv-drain/blockjob/drain_all", test_blockjob_drain_all);
103
g_test_add_func("/bdrv-drain/blockjob/drain", test_blockjob_drain);
43
--
104
--
44
2.13.6
105
2.13.6
45
106
46
107
1
This gets rid of more direct accesses to BlockJob fields from the
1
Since commit bde70715, base is the only node that is reopened in
2
job drivers.
2
commit_start(). This means that the code, which still involves an
3
explicit BlockReopenQueue, can now be simplified by using bdrv_reopen().
3
4
4
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
5
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
5
Reviewed-by: Eric Blake <eblake@redhat.com>
6
Reviewed-by: Fam Zheng <famz@redhat.com>
6
Reviewed-by: Max Reitz <mreitz@redhat.com>
7
Reviewed-by: John Snow <jsnow@redhat.com>
8
---
7
---
9
include/block/blockjob_int.h | 8 ++++++++
8
block/commit.c | 8 +-------
10
block/backup.c | 18 +++++++-----------
9
1 file changed, 1 insertion(+), 7 deletions(-)
11
block/commit.c | 4 ++--
12
block/mirror.c | 5 +----
13
block/stream.c | 4 ++--
14
blockjob.c | 9 +++++++++
15
6 files changed, 29 insertions(+), 19 deletions(-)
16
10
17
diff --git a/include/block/blockjob_int.h b/include/block/blockjob_int.h
18
index XXXXXXX..XXXXXXX 100644
19
--- a/include/block/blockjob_int.h
20
+++ b/include/block/blockjob_int.h
21
@@ -XXX,XX +XXX,XX @@ void block_job_sleep_ns(BlockJob *job, int64_t ns);
22
void block_job_yield(BlockJob *job);
23
24
/**
25
+ * block_job_ratelimit_get_delay:
26
+ *
27
+ * Calculate and return delay for the next request in ns. See the documentation
28
+ * of ratelimit_calculate_delay() for details.
29
+ */
30
+int64_t block_job_ratelimit_get_delay(BlockJob *job, uint64_t n);
31
+
32
+/**
33
* block_job_early_fail:
34
* @bs: The block device.
35
*
36
diff --git a/block/backup.c b/block/backup.c
37
index XXXXXXX..XXXXXXX 100644
38
--- a/block/backup.c
39
+++ b/block/backup.c
40
@@ -XXX,XX +XXX,XX @@ static void backup_complete(BlockJob *job, void *opaque)
41
42
static bool coroutine_fn yield_and_check(BackupBlockJob *job)
43
{
44
+ uint64_t delay_ns;
45
+
46
if (block_job_is_cancelled(&job->common)) {
47
return true;
48
}
49
50
- /* we need to yield so that bdrv_drain_all() returns.
51
- * (without, VM does not reboot)
52
- */
53
- if (job->common.speed) {
54
- uint64_t delay_ns = ratelimit_calculate_delay(&job->common.limit,
55
- job->bytes_read);
56
- job->bytes_read = 0;
57
- block_job_sleep_ns(&job->common, delay_ns);
58
- } else {
59
- block_job_sleep_ns(&job->common, 0);
60
- }
61
+ /* We need to yield even for delay_ns = 0 so that bdrv_drain_all() can
62
+ * return. Without a yield, the VM would not reboot. */
63
+ delay_ns = block_job_ratelimit_get_delay(&job->common, job->bytes_read);
64
+ job->bytes_read = 0;
65
+ block_job_sleep_ns(&job->common, delay_ns);
66
67
if (block_job_is_cancelled(&job->common)) {
68
return true;
69
diff --git a/block/commit.c b/block/commit.c
11
diff --git a/block/commit.c b/block/commit.c
70
index XXXXXXX..XXXXXXX 100644
12
index XXXXXXX..XXXXXXX 100644
71
--- a/block/commit.c
13
--- a/block/commit.c
72
+++ b/block/commit.c
14
+++ b/block/commit.c
73
@@ -XXX,XX +XXX,XX @@ static void coroutine_fn commit_run(void *opaque)
15
@@ -XXX,XX +XXX,XX @@ void commit_start(const char *job_id, BlockDriverState *bs,
74
/* Publish progress */
16
const char *filter_node_name, Error **errp)
75
block_job_progress_update(&s->common, n);
76
77
- if (copy && s->common.speed) {
78
- delay_ns = ratelimit_calculate_delay(&s->common.limit, n);
79
+ if (copy) {
80
+ delay_ns = block_job_ratelimit_get_delay(&s->common, n);
81
} else {
82
delay_ns = 0;
83
}
84
diff --git a/block/mirror.c b/block/mirror.c
85
index XXXXXXX..XXXXXXX 100644
86
--- a/block/mirror.c
87
+++ b/block/mirror.c
88
@@ -XXX,XX +XXX,XX @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
89
assert(io_bytes);
90
offset += io_bytes;
91
nb_chunks -= DIV_ROUND_UP(io_bytes, s->granularity);
92
- if (s->common.speed) {
93
- delay_ns = ratelimit_calculate_delay(&s->common.limit,
94
- io_bytes_acct);
95
- }
96
+ delay_ns = block_job_ratelimit_get_delay(&s->common, io_bytes_acct);
97
}
98
return delay_ns;
99
}
100
diff --git a/block/stream.c b/block/stream.c
101
index XXXXXXX..XXXXXXX 100644
102
--- a/block/stream.c
103
+++ b/block/stream.c
104
@@ -XXX,XX +XXX,XX @@ static void coroutine_fn stream_run(void *opaque)
105
106
/* Publish progress */
107
block_job_progress_update(&s->common, n);
108
- if (copy && s->common.speed) {
109
- delay_ns = ratelimit_calculate_delay(&s->common.limit, n);
110
+ if (copy) {
111
+ delay_ns = block_job_ratelimit_get_delay(&s->common, n);
112
} else {
113
delay_ns = 0;
114
}
115
diff --git a/blockjob.c b/blockjob.c
116
index XXXXXXX..XXXXXXX 100644
117
--- a/blockjob.c
118
+++ b/blockjob.c
119
@@ -XXX,XX +XXX,XX @@ void block_job_set_speed(BlockJob *job, int64_t speed, Error **errp)
120
block_job_enter_cond(job, block_job_timer_pending);
121
}
122
123
+int64_t block_job_ratelimit_get_delay(BlockJob *job, uint64_t n)
124
+{
125
+ if (!job->speed) {
126
+ return 0;
127
+ }
128
+
129
+ return ratelimit_calculate_delay(&job->limit, n);
130
+}
131
+
132
void block_job_complete(BlockJob *job, Error **errp)
133
{
17
{
134
/* Should not be reachable via external interface for internal jobs */
18
CommitBlockJob *s;
19
- BlockReopenQueue *reopen_queue = NULL;
20
int orig_base_flags;
21
BlockDriverState *iter;
22
BlockDriverState *commit_top_bs = NULL;
23
@@ -XXX,XX +XXX,XX @@ void commit_start(const char *job_id, BlockDriverState *bs,
24
/* convert base to r/w, if necessary */
25
orig_base_flags = bdrv_get_flags(base);
26
if (!(orig_base_flags & BDRV_O_RDWR)) {
27
- reopen_queue = bdrv_reopen_queue(reopen_queue, base, NULL,
28
- orig_base_flags | BDRV_O_RDWR);
29
- }
30
-
31
- if (reopen_queue) {
32
- bdrv_reopen_multiple(bdrv_get_aio_context(bs), reopen_queue, &local_err);
33
+ bdrv_reopen(base, orig_base_flags | BDRV_O_RDWR, &local_err);
34
if (local_err != NULL) {
35
error_propagate(errp, local_err);
36
goto fail;
135
--
37
--
136
2.13.6
38
2.13.6
137
39
138
40
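For illustration, the slice-based accounting that block_job_ratelimit_get_delay() delegates to ratelimit_calculate_delay() can be sketched as follows. This is a simplified standalone model, not QEMU's actual ratelimit.h; the DemoRateLimit type, field names, and function name are all invented for this sketch. The idea is the same: account the bytes processed in the current time slice, and once the per-slice quota is exceeded, tell the job to sleep until the slice ends.

```c
#include <stdint.h>

/* Hypothetical, simplified model of a slice-based rate limiter. */
typedef struct DemoRateLimit {
    int64_t slice_start_ns;   /* beginning of the current accounting slice */
    int64_t slice_len_ns;     /* slice length, e.g. 100 ms */
    uint64_t slice_quota;     /* bytes allowed per slice */
    uint64_t dispatched;      /* bytes accounted in the current slice */
} DemoRateLimit;

/* Returns how long the caller should sleep (in ns); 0 means no delay.
 * The caller passes the current time instead of reading a clock so the
 * sketch stays deterministic. */
int64_t demo_ratelimit_calculate_delay(DemoRateLimit *limit,
                                       int64_t now_ns, uint64_t n)
{
    if (now_ns - limit->slice_start_ns >= limit->slice_len_ns) {
        /* A new slice has started: reset the accounting. */
        limit->slice_start_ns = now_ns;
        limit->dispatched = 0;
    }
    limit->dispatched += n;
    if (limit->dispatched <= limit->slice_quota) {
        return 0;
    }
    /* Quota exceeded: sleep until the current slice is over. */
    return limit->slice_start_ns + limit->slice_len_ns - now_ns;
}
```

With a wrapper like the new block_job_ratelimit_get_delay(), the `!job->speed` check keeps unthrottled jobs on the fast path: they always get a delay of 0 without touching the limiter state.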
From: Eric Blake <eblake@redhat.com>

We are gradually moving away from sector-based interfaces, towards
byte-based. Make the change for the last few sector-based callbacks
in the null-co and null-aio drivers.

Note that since the null driver does nothing on writes, it trivially
supports the BDRV_REQ_FUA flag (all writes have already landed to
the same bit-bucket without needing an extra flush call). Also, since
the null driver does just as well with byte-based requests, we can
now avoid cycles wasted on read-modify-write by taking advantage of
the block layer now defaulting the alignment to 1 instead of 512.

Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/null.c | 45 +++++++++++++++++++++++----------------------
 1 file changed, 23 insertions(+), 22 deletions(-)

diff --git a/block/null.c b/block/null.c
index XXXXXXX..XXXXXXX 100644
--- a/block/null.c
+++ b/block/null.c
@@ -XXX,XX +XXX,XX @@ static int null_file_open(BlockDriverState *bs, QDict *options, int flags,
     }
     s->read_zeroes = qemu_opt_get_bool(opts, NULL_OPT_ZEROES, false);
     qemu_opts_del(opts);
+    bs->supported_write_flags = BDRV_REQ_FUA;
     return ret;
 }
 
@@ -XXX,XX +XXX,XX @@ static coroutine_fn int null_co_common(BlockDriverState *bs)
     return 0;
 }
 
-static coroutine_fn int null_co_readv(BlockDriverState *bs,
-                                      int64_t sector_num, int nb_sectors,
-                                      QEMUIOVector *qiov)
+static coroutine_fn int null_co_preadv(BlockDriverState *bs,
+                                       uint64_t offset, uint64_t bytes,
+                                       QEMUIOVector *qiov, int flags)
 {
     BDRVNullState *s = bs->opaque;
 
     if (s->read_zeroes) {
-        qemu_iovec_memset(qiov, 0, 0, nb_sectors * BDRV_SECTOR_SIZE);
+        qemu_iovec_memset(qiov, 0, 0, bytes);
     }
 
     return null_co_common(bs);
 }
 
-static coroutine_fn int null_co_writev(BlockDriverState *bs,
-                                       int64_t sector_num, int nb_sectors,
-                                       QEMUIOVector *qiov)
+static coroutine_fn int null_co_pwritev(BlockDriverState *bs,
+                                        uint64_t offset, uint64_t bytes,
+                                        QEMUIOVector *qiov, int flags)
 {
     return null_co_common(bs);
 }
 
@@ -XXX,XX +XXX,XX @@ static inline BlockAIOCB *null_aio_common(BlockDriverState *bs,
     return &acb->common;
 }
 
-static BlockAIOCB *null_aio_readv(BlockDriverState *bs,
-                                  int64_t sector_num, QEMUIOVector *qiov,
-                                  int nb_sectors,
-                                  BlockCompletionFunc *cb,
-                                  void *opaque)
+static BlockAIOCB *null_aio_preadv(BlockDriverState *bs,
+                                   uint64_t offset, uint64_t bytes,
+                                   QEMUIOVector *qiov, int flags,
+                                   BlockCompletionFunc *cb,
+                                   void *opaque)
 {
     BDRVNullState *s = bs->opaque;
 
     if (s->read_zeroes) {
-        qemu_iovec_memset(qiov, 0, 0, nb_sectors * BDRV_SECTOR_SIZE);
+        qemu_iovec_memset(qiov, 0, 0, bytes);
     }
 
     return null_aio_common(bs, cb, opaque);
 }
 
-static BlockAIOCB *null_aio_writev(BlockDriverState *bs,
-                                   int64_t sector_num, QEMUIOVector *qiov,
-                                   int nb_sectors,
-                                   BlockCompletionFunc *cb,
-                                   void *opaque)
+static BlockAIOCB *null_aio_pwritev(BlockDriverState *bs,
+                                    uint64_t offset, uint64_t bytes,
+                                    QEMUIOVector *qiov, int flags,
+                                    BlockCompletionFunc *cb,
+                                    void *opaque)
 {
     return null_aio_common(bs, cb, opaque);
 }
 
@@ -XXX,XX +XXX,XX @@ static BlockDriver bdrv_null_co = {
     .bdrv_close             = null_close,
     .bdrv_getlength         = null_getlength,
 
-    .bdrv_co_readv          = null_co_readv,
-    .bdrv_co_writev         = null_co_writev,
+    .bdrv_co_preadv         = null_co_preadv,
+    .bdrv_co_pwritev        = null_co_pwritev,
     .bdrv_co_flush_to_disk  = null_co_flush,
     .bdrv_reopen_prepare    = null_reopen_prepare,
 
@@ -XXX,XX +XXX,XX @@ static BlockDriver bdrv_null_aio = {
     .bdrv_close             = null_close,
     .bdrv_getlength         = null_getlength,
 
-    .bdrv_aio_readv         = null_aio_readv,
-    .bdrv_aio_writev        = null_aio_writev,
+    .bdrv_aio_preadv        = null_aio_preadv,
+    .bdrv_aio_pwritev       = null_aio_pwritev,
     .bdrv_aio_flush         = null_aio_flush,
     .bdrv_reopen_prepare    = null_reopen_prepare,
 
--
2.13.6

The bdrv_reopen*() implementation doesn't like it if the graph is
changed between queuing nodes for reopen and actually reopening them
(one of the reasons is that queuing can be recursive).

So instead of draining the device only in bdrv_reopen_multiple(),
require that callers already drained all affected nodes, and assert this
in bdrv_reopen_queue().

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
---
 block.c             | 23 ++++++++++++++++-------
 block/replication.c |  6 ++++++
 qemu-io-cmds.c      |  3 +++
 3 files changed, 25 insertions(+), 7 deletions(-)

diff --git a/block.c b/block.c
index XXXXXXX..XXXXXXX 100644
--- a/block.c
+++ b/block.c
@@ -XXX,XX +XXX,XX @@ BlockDriverState *bdrv_open(const char *filename, const char *reference,
  * returns a pointer to bs_queue, which is either the newly allocated
  * bs_queue, or the existing bs_queue being used.
  *
+ * bs must be drained between bdrv_reopen_queue() and bdrv_reopen_multiple().
  */
 static BlockReopenQueue *bdrv_reopen_queue_child(BlockReopenQueue *bs_queue,
                                                  BlockDriverState *bs,
@@ -XXX,XX +XXX,XX @@ static BlockReopenQueue *bdrv_reopen_queue_child(BlockReopenQueue *bs_queue,
     BdrvChild *child;
     QDict *old_options, *explicit_options;
 
+    /* Make sure that the caller remembered to use a drained section. This is
+     * important to avoid graph changes between the recursive queuing here and
+     * bdrv_reopen_multiple(). */
+    assert(bs->quiesce_counter > 0);
+
     if (bs_queue == NULL) {
         bs_queue = g_new0(BlockReopenQueue, 1);
         QSIMPLEQ_INIT(bs_queue);
@@ -XXX,XX +XXX,XX @@ BlockReopenQueue *bdrv_reopen_queue(BlockReopenQueue *bs_queue,
  * If all devices prepare successfully, then the changes are committed
  * to all devices.
  *
+ * All affected nodes must be drained between bdrv_reopen_queue() and
+ * bdrv_reopen_multiple().
  */
 int bdrv_reopen_multiple(AioContext *ctx, BlockReopenQueue *bs_queue, Error **errp)
 {
@@ -XXX,XX +XXX,XX @@ int bdrv_reopen_multiple(AioContext *ctx, BlockReopenQueue *bs_queue, Error **er
 
     assert(bs_queue != NULL);
 
-    aio_context_release(ctx);
-    bdrv_drain_all_begin();
-    aio_context_acquire(ctx);
-
     QSIMPLEQ_FOREACH(bs_entry, bs_queue, entry) {
+        assert(bs_entry->state.bs->quiesce_counter > 0);
         if (bdrv_reopen_prepare(&bs_entry->state, bs_queue, &local_err)) {
             error_propagate(errp, local_err);
             goto cleanup;
@@ -XXX,XX +XXX,XX @@ cleanup:
     }
     g_free(bs_queue);
 
-    bdrv_drain_all_end();
-
     return ret;
 }
 
@@ -XXX,XX +XXX,XX @@ int bdrv_reopen(BlockDriverState *bs, int bdrv_flags, Error **errp)
 {
     int ret = -1;
     Error *local_err = NULL;
-    BlockReopenQueue *queue = bdrv_reopen_queue(NULL, bs, NULL, bdrv_flags);
+    BlockReopenQueue *queue;
 
+    bdrv_subtree_drained_begin(bs);
+
+    queue = bdrv_reopen_queue(NULL, bs, NULL, bdrv_flags);
     ret = bdrv_reopen_multiple(bdrv_get_aio_context(bs), queue, &local_err);
     if (local_err != NULL) {
         error_propagate(errp, local_err);
     }
+
+    bdrv_subtree_drained_end(bs);
+
     return ret;
 }
 
diff --git a/block/replication.c b/block/replication.c
index XXXXXXX..XXXXXXX 100644
--- a/block/replication.c
+++ b/block/replication.c
@@ -XXX,XX +XXX,XX @@ static void reopen_backing_file(BlockDriverState *bs, bool writable,
         new_secondary_flags = s->orig_secondary_flags;
     }
 
+    bdrv_subtree_drained_begin(s->hidden_disk->bs);
+    bdrv_subtree_drained_begin(s->secondary_disk->bs);
+
     if (orig_hidden_flags != new_hidden_flags) {
         reopen_queue = bdrv_reopen_queue(reopen_queue, s->hidden_disk->bs, NULL,
                                          new_hidden_flags);
@@ -XXX,XX +XXX,XX @@ static void reopen_backing_file(BlockDriverState *bs, bool writable,
                              reopen_queue, &local_err);
         error_propagate(errp, local_err);
     }
+
+    bdrv_subtree_drained_end(s->hidden_disk->bs);
+    bdrv_subtree_drained_end(s->secondary_disk->bs);
 }
 
 static void backup_job_cleanup(BlockDriverState *bs)
diff --git a/qemu-io-cmds.c b/qemu-io-cmds.c
index XXXXXXX..XXXXXXX 100644
--- a/qemu-io-cmds.c
+++ b/qemu-io-cmds.c
@@ -XXX,XX +XXX,XX @@ static int reopen_f(BlockBackend *blk, int argc, char **argv)
     opts = qopts ? qemu_opts_to_qdict(qopts, NULL) : NULL;
     qemu_opts_reset(&reopen_opts);
 
+    bdrv_subtree_drained_begin(bs);
     brq = bdrv_reopen_queue(NULL, bs, opts, flags);
     bdrv_reopen_multiple(bdrv_get_aio_context(bs), brq, &local_err);
+    bdrv_subtree_drained_end(bs);
+
    if (local_err) {
        error_report_err(local_err);
    } else {
--
2.13.6
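The sector-to-byte interface conversion above is mostly mechanical: a legacy sector-based callback is the byte-based one with offset and length scaled by the 512-byte sector size. The following sketch illustrates that relationship; the demo_* names and the 1 GiB demo disk are invented for illustration, only the 512-byte sector size matches QEMU's BDRV_SECTOR_SIZE.

```c
#include <stdint.h>

#define DEMO_SECTOR_SIZE 512ULL
#define DEMO_DISK_BYTES  (1ULL << 30)   /* hypothetical 1 GiB device */

/* Byte-based interface: the granularity the block layer now prefers,
 * with alignment defaulting to 1 instead of 512. Returns 0 on success,
 * -1 if the request runs past the end of the demo disk. */
int demo_preadv(uint64_t offset, uint64_t bytes)
{
    /* a real driver would fill the request's iovec here */
    return (offset + bytes > DEMO_DISK_BYTES) ? -1 : 0;
}

/* Legacy sector-based wrapper: every request is implicitly a multiple
 * of 512 bytes, which is exactly what the byte-based interface no
 * longer forces on callers. */
int demo_readv(int64_t sector_num, int nb_sectors)
{
    return demo_preadv((uint64_t)sector_num * DEMO_SECTOR_SIZE,
                       (uint64_t)nb_sectors * DEMO_SECTOR_SIZE);
}
```

Going in this direction (byte-based core, sector-based shim) also shows why sub-sector requests on the old interface required read-modify-write: the shim simply cannot express an offset or length that is not a multiple of 512.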
Deleted patch
From: Alberto Garcia <berto@igalia.com>

Compressed clusters are not supposed to have the COPIED bit set, but
this is not made explicit in the specs, so let's document it.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Message-id: 74552e1d6e858d3159cb0c0e188e80bc9248e337.1523376013.git.berto@igalia.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 docs/interop/qcow2.txt | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/docs/interop/qcow2.txt b/docs/interop/qcow2.txt
index XXXXXXX..XXXXXXX 100644
--- a/docs/interop/qcow2.txt
+++ b/docs/interop/qcow2.txt
@@ -XXX,XX +XXX,XX @@ L2 table entry:
                     62:    0 for standard clusters
                            1 for compressed clusters
 
-                    63:    0 for a cluster that is unused or requires COW, 1 if its
-                           refcount is exactly one. This information is only accurate
-                           in L2 tables that are reachable from the active L1
-                           table.
+                    63:    0 for clusters that are unused, compressed or require COW.
+                           1 for standard clusters whose refcount is exactly one.
+                           This information is only accurate in L2 tables
+                           that are reachable from the active L1 table.
 
 Standard Cluster Descriptor:
 
--
2.13.6
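The flag layout that this documentation patch clarifies can be checked mechanically. The sketch below decodes the two top bits of a qcow2 L2 table entry as described above: bit 62 marks a compressed cluster, and bit 63 (COPIED) may only be set on standard clusters whose refcount is exactly one. The bit positions follow the spec text; the demo_* helper names are invented for illustration.

```c
#include <stdint.h>
#include <stdbool.h>

/* Bit 62: 0 for standard clusters, 1 for compressed clusters. */
#define DEMO_L2E_COMPRESSED (1ULL << 62)
/* Bit 63 (COPIED): 1 only for standard clusters with refcount == 1. */
#define DEMO_L2E_COPIED     (1ULL << 63)

bool demo_l2_entry_is_compressed(uint64_t entry)
{
    return (entry & DEMO_L2E_COMPRESSED) != 0;
}

/* Per the clarified spec text, a well-formed entry never has COPIED set
 * on a compressed cluster: compressed clusters always have it reset. */
bool demo_l2_entry_is_valid(uint64_t entry)
{
    return !((entry & DEMO_L2E_COMPRESSED) && (entry & DEMO_L2E_COPIED));
}
```

A consistency checker built on such a predicate would flag exactly the malformed images that motivated the earlier "error message about compressed clusters with OFLAG_COPIED" fix in this series.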
Deleted patch
From: Alberto Garcia <berto@igalia.com>

We have just reduced the refcount cache size to the minimum unless
the user explicitly requests a larger one, so we have to update the
documentation to reflect this change.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Message-id: c5f0bde23558dd9d33b21fffc76ac9953cc19c56.1523968389.git.berto@igalia.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 docs/qcow2-cache.txt | 33 ++++++++++++++++-----------------
 1 file changed, 16 insertions(+), 17 deletions(-)

diff --git a/docs/qcow2-cache.txt b/docs/qcow2-cache.txt
index XXXXXXX..XXXXXXX 100644
--- a/docs/qcow2-cache.txt
+++ b/docs/qcow2-cache.txt
@@ -XXX,XX +XXX,XX @@ There are three options available, and all of them take bytes:
 "refcount-cache-size":   maximum size of the refcount block cache
 "cache-size":            maximum size of both caches combined
 
-There are two things that need to be taken into account:
+There are a few things that need to be taken into account:
 
 - Both caches must have a size that is a multiple of the cluster size
   (or the cache entry size: see "Using smaller cache sizes" below).
 
-- If you only set one of the options above, QEMU will automatically
-  adjust the others so that the L2 cache is 4 times bigger than the
-  refcount cache.
+- The default L2 cache size is 8 clusters or 1MB (whichever is more),
+  and the minimum is 2 clusters (or 2 cache entries, see below).
 
-This means that these options are equivalent:
+- The default (and minimum) refcount cache size is 4 clusters.
 
-   -drive file=hd.qcow2,l2-cache-size=2097152
-   -drive file=hd.qcow2,refcount-cache-size=524288
-   -drive file=hd.qcow2,cache-size=2621440
+- If only "cache-size" is specified then QEMU will assign as much
+  memory as possible to the L2 cache before increasing the refcount
+  cache size.
 
-The reason for this 1/4 ratio is to ensure that both caches cover the
-same amount of disk space. Note however that this is only valid with
-the default value of refcount_bits (16). If you are using a different
-value you might want to calculate both cache sizes yourself since QEMU
-will always use the same 1/4 ratio.
+Unlike L2 tables, refcount blocks are not used during normal I/O but
+only during allocations and internal snapshots. In most cases they are
+accessed sequentially (even during random guest I/O) so increasing the
+refcount cache size won't have any measurable effect in performance
+(this can change if you are using internal snapshots, so you may want
+to think about increasing the cache size if you use them heavily).
 
-It's also worth mentioning that there's no strict need for both caches
-to cover the same amount of disk space. The refcount cache is used
-much less often than the L2 cache, so it's perfectly reasonable to
-keep it small.
+Before QEMU 2.12 the refcount cache had a default size of 1/4 of the
+L2 cache size. This resulted in unnecessarily large caches, so now the
+refcount cache is as small as possible unless overridden by the user.
 
 
 Using smaller cache entries
--
2.13.6
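The new default sizing rules documented above are simple enough to state as code. The following sketch computes the QEMU 2.12 defaults exactly as the updated text describes them (L2 cache: 8 clusters or 1 MB, whichever is more; refcount cache: 4 clusters, now independent of the L2 cache size). The function names are invented for illustration and do not correspond to QEMU internals.

```c
#include <stdint.h>

#define DEMO_MIB (1024ULL * 1024)

/* "The default L2 cache size is 8 clusters or 1MB (whichever is more)" */
uint64_t demo_default_l2_cache_size(uint64_t cluster_size)
{
    uint64_t eight_clusters = 8 * cluster_size;
    return eight_clusters > DEMO_MIB ? eight_clusters : DEMO_MIB;
}

/* "The default (and minimum) refcount cache size is 4 clusters" --
 * no longer tied to the L2 cache by the pre-2.12 1/4 ratio. */
uint64_t demo_default_refcount_cache_size(uint64_t cluster_size)
{
    return 4 * cluster_size;
}
```

For the common 64 KB cluster size, 8 clusters is only 512 KB, so the 1 MB floor wins for the L2 cache, while the refcount cache drops to 256 KB instead of the 1/4-of-L2 value it would have had before.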