1
The following changes since commit 9436e082de18b2fb2ceed2e9d1beef641ae64f23:
1
The following changes since commit 60205b71421cbc529ca60b12c79e0eeace007319:
2
2
3
MAINTAINERS: clarify some of the tags (2018-11-19 11:19:23 +0000)
3
Merge tag 'pull-aspeed-20220801' of https://github.com/legoater/qemu into staging (2022-08-01 13:55:11 -0700)
4
4
5
are available in the Git repository at:
5
are available in the Git repository at:
6
6
7
git://repo.or.cz/qemu/kevin.git tags/for-upstream
7
git://repo.or.cz/qemu/kevin.git tags/for-upstream
8
8
9
for you to fetch changes up to 6d0a4a0fb5c8f10c8eb68b52cfda0082b00ae963:
9
for you to fetch changes up to 21b1d974595b3986c68fe80a1f7e9b87886d4bae:
10
10
11
iotests: Test file-posix locking and reopen (2018-11-19 14:32:04 +0100)
11
main loop: add missing documentation links to GS/IO macros (2022-08-02 12:02:17 +0200)
12
12
13
----------------------------------------------------------------
13
----------------------------------------------------------------
14
Block layer patches:
14
Block layer patches
15
15
16
- file-posix: Fix shared permission locks after reopen
16
- libvduse: Coverity fixes
17
- block: Fix error path for failed .bdrv_reopen_prepare
17
- hd-geometry: Fix ignored bios-chs-trans setting
18
- qcow2: Catch invalid allocations when the image becomes too large
18
- io_uring: Fix compiler warning (missing #include)
19
- vvfat/fdc/nvme: Fix segfaults and leaks
19
- main loop: add missing documentation links to GS/IO macros
20
- qemu-iotests: Discard stderr when probing devices
20
21
21
----------------------------------------------------------------
22
----------------------------------------------------------------
22
Eric Blake (3):
23
Cole Robinson (1):
23
qcow2: Document some maximum size constraints
24
qemu-iotests: Discard stderr when probing devices
24
qcow2: Don't allow overflow during cluster allocation
25
iotests: Add new test 220 for max compressed cluster offset
26
25
27
Kevin Wolf (1):
26
Emanuele Giuseppe Esposito (1):
28
vvfat: Fix memory leak
27
main loop: add missing documentation links to GS/IO macros
29
28
30
Li Qiang (1):
29
Jinhao Fan (1):
31
nvme: fix oob access issue(CVE-2018-16847)
30
block/io_uring: add missing include file
32
31
33
Mark Cave-Ayland (1):
32
Lev Kujawski (1):
34
fdc: fix segfault in fdctrl_stop_transfer() when DMA is disabled
33
hw/block/hd-geometry: Do not override specified bios-chs-trans
35
34
36
Max Reitz (3):
35
Xie Yongji (3):
37
block: Always abort reopen after prepare succeeded
36
libvduse: Fix the incorrect function name
38
file-posix: Fix shared locks on reopen commit
37
libvduse: Replace strcpy() with strncpy()
39
iotests: Test file-posix locking and reopen
38
libvduse: Pass positive value to strerror()
40
39
41
docs/interop/qcow2.txt | 38 +++++++++++++++++-
40
include/qemu/main-loop.h | 18 +++++++++++++++---
42
block/qcow2.h | 6 +++
41
block/io_uring.c | 1 +
43
block.c | 12 ++++++
42
hw/block/hd-geometry.c | 7 ++++++-
44
block/file-posix.c | 2 +-
43
subprojects/libvduse/libvduse.c | 13 +++++++------
45
block/qcow2-refcount.c | 20 ++++++----
44
tests/qemu-iotests/common.rc | 4 ++--
46
block/vvfat.c | 6 +--
45
5 files changed, 31 insertions(+), 12 deletions(-)
47
hw/block/fdc.c | 2 +-
48
hw/block/nvme.c | 7 ++++
49
tests/qemu-iotests/182 | 71 ++++++++++++++++++++++++++++++++++
50
tests/qemu-iotests/182.out | 9 +++++
51
tests/qemu-iotests/220 | 96 ++++++++++++++++++++++++++++++++++++++++++++++
52
tests/qemu-iotests/220.out | 54 ++++++++++++++++++++++++++
53
tests/qemu-iotests/group | 1 +
54
13 files changed, 310 insertions(+), 14 deletions(-)
55
create mode 100755 tests/qemu-iotests/220
56
create mode 100644 tests/qemu-iotests/220.out
57
diff view generated by jsdifflib
1
From: Max Reitz <mreitz@redhat.com>
1
From: Jinhao Fan <fanjinhao21s@ict.ac.cn>
2
2
3
Signed-off-by: Max Reitz <mreitz@redhat.com>
3
The commit "Use io_uring_register_ring_fd() to skip fd operations" uses
4
Reviewed-by: Alberto Garcia <berto@igalia.com>
4
warn_report but did not include the header file "qemu/error-report.h".
5
This causes "error: implicit declaration of function ‘warn_report’".
6
Include this header file.
7
8
Fixes: e2848bc574 ("Use io_uring_register_ring_fd() to skip fd operations")
9
Signed-off-by: Jinhao Fan <fanjinhao21s@ict.ac.cn>
10
Message-Id: <20220721065645.577404-1-fanjinhao21s@ict.ac.cn>
11
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
5
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
12
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
6
---
13
---
7
tests/qemu-iotests/182 | 71 ++++++++++++++++++++++++++++++++++++++
14
block/io_uring.c | 1 +
8
tests/qemu-iotests/182.out | 9 +++++
15
1 file changed, 1 insertion(+)
9
2 files changed, 80 insertions(+)
10
16
11
diff --git a/tests/qemu-iotests/182 b/tests/qemu-iotests/182
17
diff --git a/block/io_uring.c b/block/io_uring.c
12
index XXXXXXX..XXXXXXX 100755
13
--- a/tests/qemu-iotests/182
14
+++ b/tests/qemu-iotests/182
15
@@ -XXX,XX +XXX,XX @@ status=1    # failure is the default!
16
_cleanup()
17
{
18
_cleanup_test_img
19
+ rm -f "$TEST_IMG.overlay"
20
}
21
trap "_cleanup; exit \$status" 0 1 2 3 15
22
23
@@ -XXX,XX +XXX,XX @@ echo 'quit' | $QEMU -nographic -monitor stdio \
24
25
_cleanup_qemu
26
27
+echo
28
+echo '=== Testing reopen ==='
29
+echo
30
+
31
+# This tests that reopening does not unshare any permissions it should
32
+# not unshare
33
+# (There was a bug where reopening shared exactly the opposite of the
34
+# permissions it was supposed to share)
35
+
36
+_launch_qemu
37
+
38
+_send_qemu_cmd $QEMU_HANDLE \
39
+ "{'execute': 'qmp_capabilities'}" \
40
+ 'return'
41
+
42
+# Open the image without any format layer (we are not going to access
43
+# it, so that is fine)
44
+# This should keep all permissions shared.
45
+success_or_failure=y _send_qemu_cmd $QEMU_HANDLE \
46
+ "{'execute': 'blockdev-add',
47
+ 'arguments': {
48
+ 'node-name': 'node0',
49
+ 'driver': 'file',
50
+ 'filename': '$TEST_IMG',
51
+ 'locking': 'on'
52
+ } }" \
53
+ 'return' \
54
+ 'error'
55
+
56
+# This snapshot will perform a reopen to drop R/W to RO.
57
+# It should still keep all permissions shared.
58
+success_or_failure=y _send_qemu_cmd $QEMU_HANDLE \
59
+ "{'execute': 'blockdev-snapshot-sync',
60
+ 'arguments': {
61
+ 'node-name': 'node0',
62
+ 'snapshot-file': '$TEST_IMG.overlay',
63
+ 'snapshot-node-name': 'node1'
64
+ } }" \
65
+ 'return' \
66
+ 'error'
67
+
68
+# Now open the same file again
69
+# This does not require any permissions (and does not unshare any), so
70
+# this will not conflict with node0.
71
+success_or_failure=y _send_qemu_cmd $QEMU_HANDLE \
72
+ "{'execute': 'blockdev-add',
73
+ 'arguments': {
74
+ 'node-name': 'node1',
75
+ 'driver': 'file',
76
+ 'filename': '$TEST_IMG',
77
+ 'locking': 'on'
78
+ } }" \
79
+ 'return' \
80
+ 'error'
81
+
82
+# Now we attach the image to a virtio-blk device. This device does
83
+# require some permissions (at least WRITE and READ_CONSISTENT), so if
84
+# reopening node0 unshared any (which it should not have), this will
85
+# fail (but it should not).
86
+success_or_failure=y _send_qemu_cmd $QEMU_HANDLE \
87
+ "{'execute': 'device_add',
88
+ 'arguments': {
89
+ 'driver': 'virtio-blk',
90
+ 'drive': 'node1'
91
+ } }" \
92
+ 'return' \
93
+ 'error'
94
+
95
+_cleanup_qemu
96
+
97
# success, all done
98
echo "*** done"
99
rm -f $seq.full
100
diff --git a/tests/qemu-iotests/182.out b/tests/qemu-iotests/182.out
101
index XXXXXXX..XXXXXXX 100644
18
index XXXXXXX..XXXXXXX 100644
102
--- a/tests/qemu-iotests/182.out
19
--- a/block/io_uring.c
103
+++ b/tests/qemu-iotests/182.out
20
+++ b/block/io_uring.c
104
@@ -XXX,XX +XXX,XX @@ Starting QEMU
21
@@ -XXX,XX +XXX,XX @@
105
Starting a second QEMU using the same image should fail
22
#include "qemu/osdep.h"
106
QEMU_PROG: -drive file=TEST_DIR/t.qcow2,if=none,id=drive0,file.locking=on: Failed to get "write" lock
23
#include <liburing.h>
107
Is another process using the image [TEST_DIR/t.qcow2]?
24
#include "block/aio.h"
108
+
25
+#include "qemu/error-report.h"
109
+=== Testing reopen ===
26
#include "qemu/queue.h"
110
+
27
#include "block/block.h"
111
+{"return": {}}
28
#include "block/raw-aio.h"
112
+{"return": {}}
113
+Formatting 'TEST_DIR/t.qcow2.overlay', fmt=qcow2 size=197120 backing_file=TEST_DIR/t.qcow2 backing_fmt=file cluster_size=65536 lazy_refcounts=off refcount_bits=16
114
+{"return": {}}
115
+{"return": {}}
116
+{"return": {}}
117
*** done
118
--
29
--
119
2.19.1
30
2.35.3
120
31
121
32
diff view generated by jsdifflib
1
From: Max Reitz <mreitz@redhat.com>
1
From: Xie Yongji <xieyongji@bytedance.com>
2
2
3
s->locked_shared_perm is the set of bits locked in the file, which is
3
In vduse_name_is_valid(), we actually check whether
4
the inverse of the permissions actually shared. So we need to pass them
4
the name is invalid or not. So let's change the
5
as they are to raw_apply_lock_bytes() instead of inverting them again.
5
function name to vduse_name_is_invalid() to match
6
the behavior.
6
7
7
Reported-by: Alberto Garcia <berto@igalia.com>
8
Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
8
Signed-off-by: Max Reitz <mreitz@redhat.com>
9
Reviewed-by: Markus Armbruster <armbru@redhat.com>
9
Reviewed-by: Alberto Garcia <berto@igalia.com>
10
Message-Id: <20220706095624.328-2-xieyongji@bytedance.com>
10
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
11
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
11
---
12
---
12
block/file-posix.c | 2 +-
13
subprojects/libvduse/libvduse.c | 6 +++---
13
1 file changed, 1 insertion(+), 1 deletion(-)
14
1 file changed, 3 insertions(+), 3 deletions(-)
14
15
15
diff --git a/block/file-posix.c b/block/file-posix.c
16
diff --git a/subprojects/libvduse/libvduse.c b/subprojects/libvduse/libvduse.c
16
index XXXXXXX..XXXXXXX 100644
17
index XXXXXXX..XXXXXXX 100644
17
--- a/block/file-posix.c
18
--- a/subprojects/libvduse/libvduse.c
18
+++ b/block/file-posix.c
19
+++ b/subprojects/libvduse/libvduse.c
19
@@ -XXX,XX +XXX,XX @@ static void raw_reopen_commit(BDRVReopenState *state)
20
@@ -XXX,XX +XXX,XX @@ static int vduse_dev_init(VduseDev *dev, const char *name,
20
21
return 0;
21
/* Copy locks to the new fd before closing the old one. */
22
}
22
raw_apply_lock_bytes(NULL, rs->fd, s->locked_perm,
23
23
- ~s->locked_shared_perm, false, &local_err);
24
-static inline bool vduse_name_is_valid(const char *name)
24
+ s->locked_shared_perm, false, &local_err);
25
+static inline bool vduse_name_is_invalid(const char *name)
25
if (local_err) {
26
{
26
/* shouldn't fail in a sane host, but report it just in case. */
27
return strlen(name) >= VDUSE_NAME_MAX || strstr(name, "..");
27
error_report_err(local_err);
28
}
29
@@ -XXX,XX +XXX,XX @@ VduseDev *vduse_dev_create_by_name(const char *name, uint16_t num_queues,
30
VduseDev *dev;
31
int ret;
32
33
- if (!name || vduse_name_is_valid(name) || !ops ||
34
+ if (!name || vduse_name_is_invalid(name) || !ops ||
35
!ops->enable_queue || !ops->disable_queue) {
36
fprintf(stderr, "Invalid parameter for vduse\n");
37
return NULL;
38
@@ -XXX,XX +XXX,XX @@ VduseDev *vduse_dev_create(const char *name, uint32_t device_id,
39
struct vduse_dev_config *dev_config;
40
size_t size = offsetof(struct vduse_dev_config, config);
41
42
- if (!name || vduse_name_is_valid(name) ||
43
+ if (!name || vduse_name_is_invalid(name) ||
44
!has_feature(features, VIRTIO_F_VERSION_1) || !config ||
45
!config_size || !ops || !ops->enable_queue || !ops->disable_queue) {
46
fprintf(stderr, "Invalid parameter for vduse\n");
28
--
47
--
29
2.19.1
48
2.35.3
30
31
diff view generated by jsdifflib
1
From: Max Reitz <mreitz@redhat.com>
1
From: Xie Yongji <xieyongji@bytedance.com>
2
2
3
bdrv_reopen_multiple() does not invoke bdrv_reopen_abort() for the
3
Coverity reported a string overflow issue since we copied
4
element of the reopen queue for which bdrv_reopen_prepare() failed,
4
"name" to "dev_config->name" without checking the length.
5
because it assumes that the prepare function will have rolled back all
5
This should be a false positive since we already checked
6
changes already.
6
the length of "name" in vduse_name_is_invalid(). But anyway,
7
let's replace strcpy() with strncpy() (as a general library,
8
we'd like to minimize dependencies on other libraries, so we
9
didn't use g_strlcpy() here) to fix the coverity complaint.
7
10
8
However, bdrv_reopen_prepare() does not do this in every case: It may
11
Fixes: Coverity CID 1490224
9
notice an error after BlockDriver.bdrv_reopen_prepare() succeeded, and
12
Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
10
it will not invoke BlockDriver.bdrv_reopen_abort() then; and neither
13
Reviewed-by: Markus Armbruster <armbru@redhat.com>
11
will bdrv_reopen_multiple(), as explained above.
14
Message-Id: <20220706095624.328-3-xieyongji@bytedance.com>
12
13
This is wrong because we must always call .bdrv_reopen_commit() or
14
.bdrv_reopen_abort() after .bdrv_reopen_prepare() has succeeded.
15
Otherwise, the block driver has no chance to undo what it has done in
16
its implementation of .bdrv_reopen_prepare().
17
18
To fix this, bdrv_reopen_prepare() has to call .bdrv_reopen_abort() if
19
it wants to return an error after .bdrv_reopen_prepare() has succeeded.
20
21
Signed-off-by: Max Reitz <mreitz@redhat.com>
22
Reviewed-by: Alberto Garcia <berto@igalia.com>
23
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
15
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
24
---
16
---
25
block.c | 12 ++++++++++++
17
subprojects/libvduse/libvduse.c | 3 ++-
26
1 file changed, 12 insertions(+)
18
1 file changed, 2 insertions(+), 1 deletion(-)
27
19
28
diff --git a/block.c b/block.c
20
diff --git a/subprojects/libvduse/libvduse.c b/subprojects/libvduse/libvduse.c
29
index XXXXXXX..XXXXXXX 100644
21
index XXXXXXX..XXXXXXX 100644
30
--- a/block.c
22
--- a/subprojects/libvduse/libvduse.c
31
+++ b/block.c
23
+++ b/subprojects/libvduse/libvduse.c
32
@@ -XXX,XX +XXX,XX @@ int bdrv_reopen_prepare(BDRVReopenState *reopen_state, BlockReopenQueue *queue,
24
@@ -XXX,XX +XXX,XX @@ VduseDev *vduse_dev_create(const char *name, uint32_t device_id,
33
QDict *orig_reopen_opts;
25
goto err_dev;
34
char *discard = NULL;
35
bool read_only;
36
+ bool drv_prepared = false;
37
38
assert(reopen_state != NULL);
39
assert(reopen_state->bs->drv != NULL);
40
@@ -XXX,XX +XXX,XX @@ int bdrv_reopen_prepare(BDRVReopenState *reopen_state, BlockReopenQueue *queue,
41
goto error;
42
}
26
}
43
27
44
+ drv_prepared = true;
28
- strcpy(dev_config->name, name);
45
+
29
+ strncpy(dev_config->name, name, VDUSE_NAME_MAX);
46
/* Options that are not handled are only okay if they are unchanged
30
+ dev_config->name[VDUSE_NAME_MAX - 1] = '\0';
47
* compared to the old state. It is expected that some options are only
31
dev_config->device_id = device_id;
48
* used for the initial open, but not reopen (e.g. filename) */
32
dev_config->vendor_id = vendor_id;
49
@@ -XXX,XX +XXX,XX @@ int bdrv_reopen_prepare(BDRVReopenState *reopen_state, BlockReopenQueue *queue,
33
dev_config->features = features;
50
reopen_state->options = qobject_ref(orig_reopen_opts);
51
52
error:
53
+ if (ret < 0 && drv_prepared) {
54
+ /* drv->bdrv_reopen_prepare() has succeeded, so we need to
55
+ * call drv->bdrv_reopen_abort() before signaling an error
56
+ * (bdrv_reopen_multiple() will not call bdrv_reopen_abort()
57
+ * when the respective bdrv_reopen_prepare() has failed) */
58
+ if (drv->bdrv_reopen_abort) {
59
+ drv->bdrv_reopen_abort(reopen_state);
60
+ }
61
+ }
62
qemu_opts_del(opts);
63
qobject_unref(orig_reopen_opts);
64
g_free(discard);
65
--
34
--
66
2.19.1
35
2.35.3
67
68
diff view generated by jsdifflib
1
From: Eric Blake <eblake@redhat.com>
1
From: Xie Yongji <xieyongji@bytedance.com>
2
2
3
If you have a capable file system (tmpfs is good, ext4 not so much;
3
The value passed to strerror() should be positive.
4
run ./check with TEST_DIR pointing to a good location so as not
4
So let's fix it.
5
to skip the test), it's actually possible to create a qcow2 file
6
that expands to a sparse 512T image with just over 38M of content.
7
The test is not the world's fastest (qemu crawling through 256M
8
bits of refcount table to find the next cluster to allocate takes
9
several seconds, as does qemu-img check reporting millions of
10
leaked clusters); but it DOES catch the problem that the previous
11
patch just fixed where writing a compressed cluster to a full
12
image ended up overwriting the wrong cluster.
13
5
14
Suggested-by: Max Reitz <mreitz@redhat.com>
6
Fixes: Coverity CID 1490226, 1490223
15
Signed-off-by: Eric Blake <eblake@redhat.com>
7
Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
16
Reviewed-by: Alberto Garcia <berto@igalia.com>
8
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
9
Reviewed-by: Markus Armbruster <armbru@redhat.com>
10
Message-Id: <20220706095624.328-4-xieyongji@bytedance.com>
17
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
11
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
18
---
12
---
19
tests/qemu-iotests/220 | 96 ++++++++++++++++++++++++++++++++++++++
13
subprojects/libvduse/libvduse.c | 4 ++--
20
tests/qemu-iotests/220.out | 54 +++++++++++++++++++++
14
1 file changed, 2 insertions(+), 2 deletions(-)
21
tests/qemu-iotests/group | 1 +
22
3 files changed, 151 insertions(+)
23
create mode 100755 tests/qemu-iotests/220
24
create mode 100644 tests/qemu-iotests/220.out
25
15
26
diff --git a/tests/qemu-iotests/220 b/tests/qemu-iotests/220
16
diff --git a/subprojects/libvduse/libvduse.c b/subprojects/libvduse/libvduse.c
27
new file mode 100755
28
index XXXXXXX..XXXXXXX
29
--- /dev/null
30
+++ b/tests/qemu-iotests/220
31
@@ -XXX,XX +XXX,XX @@
32
+#!/bin/bash
33
+#
34
+# max limits on compression in huge qcow2 files
35
+#
36
+# Copyright (C) 2018 Red Hat, Inc.
37
+#
38
+# This program is free software; you can redistribute it and/or modify
39
+# it under the terms of the GNU General Public License as published by
40
+# the Free Software Foundation; either version 2 of the License, or
41
+# (at your option) any later version.
42
+#
43
+# This program is distributed in the hope that it will be useful,
44
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
45
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
46
+# GNU General Public License for more details.
47
+#
48
+# You should have received a copy of the GNU General Public License
49
+# along with this program. If not, see <http://www.gnu.org/licenses/>.
50
+#
51
+
52
+seq=$(basename $0)
53
+echo "QA output created by $seq"
54
+
55
+status=1 # failure is the default!
56
+
57
+_cleanup()
58
+{
59
+ _cleanup_test_img
60
+}
61
+trap "_cleanup; exit \$status" 0 1 2 3 15
62
+
63
+# get standard environment, filters and checks
64
+. ./common.rc
65
+. ./common.filter
66
+. ./common.pattern
67
+
68
+_supported_fmt qcow2
69
+_supported_proto file
70
+_supported_os Linux
71
+
72
+echo "== Creating huge file =="
73
+
74
+# Sanity check: We require a file system that permits the creation
75
+# of a HUGE (but very sparse) file. tmpfs works, ext4 does not.
76
+if ! truncate --size=513T "$TEST_IMG"; then
77
+ _notrun "file system on $TEST_DIR does not support large enough files"
78
+fi
79
+rm "$TEST_IMG"
80
+IMGOPTS='cluster_size=2M,refcount_bits=1' _make_test_img 513T
81
+
82
+echo "== Populating refcounts =="
83
+# We want an image with 256M refcounts * 2M clusters = 512T referenced.
84
+# Each 2M cluster holds 16M refcounts; the refcount table initially uses
85
+# 1 refblock, so we need to add 15 more. The refcount table lives at 2M,
86
+# first refblock at 4M, L2 at 6M, so our remaining additions start at 8M.
87
+# Then, for each refblock, mark it as fully populated.
88
+to_hex() {
89
+ printf %016x\\n $1 | sed 's/\(..\)/\\x\1/g'
90
+}
91
+truncate --size=38m "$TEST_IMG"
92
+entry=$((0x200000))
93
+$QEMU_IO_PROG -f raw -c "w -P 0xff 4m 2m" "$TEST_IMG" | _filter_qemu_io
94
+for i in {1..15}; do
95
+ offs=$((0x600000 + i*0x200000))
96
+ poke_file "$TEST_IMG" $((i*8 + entry)) $(to_hex $offs)
97
+ $QEMU_IO_PROG -f raw -c "w -P 0xff $offs 2m" "$TEST_IMG" | _filter_qemu_io
98
+done
99
+
100
+echo "== Checking file before =="
101
+# FIXME: 'qemu-img check' doesn't diagnose refcounts beyond the end of
102
+# the file as leaked clusters
103
+_check_test_img 2>&1 | sed '/^Leaked cluster/d'
104
+stat -c 'image size %s' "$TEST_IMG"
105
+
106
+echo "== Trying to write compressed cluster =="
107
+# Given our file size, the next available cluster at 512T lies beyond the
108
+# maximum offset that a compressed 2M cluster can reside in
109
+$QEMU_IO_PROG -c 'w -c 0 2m' "$TEST_IMG" | _filter_qemu_io
110
+# The attempt failed, but ended up allocating a new refblock
111
+stat -c 'image size %s' "$TEST_IMG"
112
+
113
+echo "== Writing normal cluster =="
114
+# The failed write should not corrupt the image, so a normal write succeeds
115
+$QEMU_IO_PROG -c 'w 0 2m' "$TEST_IMG" | _filter_qemu_io
116
+
117
+echo "== Checking file after =="
118
+# qemu-img now sees the millions of leaked clusters, thanks to the allocations
119
+# at 512T. Undo many of our faked references to speed up the check.
120
+$QEMU_IO_PROG -f raw -c "w -z 5m 1m" -c "w -z 8m 30m" "$TEST_IMG" |
121
+ _filter_qemu_io
122
+_check_test_img 2>&1 | sed '/^Leaked cluster/d'
123
+
124
+# success, all done
125
+echo "*** done"
126
+rm -f $seq.full
127
+status=0
128
diff --git a/tests/qemu-iotests/220.out b/tests/qemu-iotests/220.out
129
new file mode 100644
130
index XXXXXXX..XXXXXXX
131
--- /dev/null
132
+++ b/tests/qemu-iotests/220.out
133
@@ -XXX,XX +XXX,XX @@
134
+QA output created by 220
135
+== Creating huge file ==
136
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=564049465049088
137
+== Populating refcounts ==
138
+wrote 2097152/2097152 bytes at offset 4194304
139
+2 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
140
+wrote 2097152/2097152 bytes at offset 8388608
141
+2 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
142
+wrote 2097152/2097152 bytes at offset 10485760
143
+2 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
144
+wrote 2097152/2097152 bytes at offset 12582912
145
+2 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
146
+wrote 2097152/2097152 bytes at offset 14680064
147
+2 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
148
+wrote 2097152/2097152 bytes at offset 16777216
149
+2 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
150
+wrote 2097152/2097152 bytes at offset 18874368
151
+2 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
152
+wrote 2097152/2097152 bytes at offset 20971520
153
+2 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
154
+wrote 2097152/2097152 bytes at offset 23068672
155
+2 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
156
+wrote 2097152/2097152 bytes at offset 25165824
157
+2 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
158
+wrote 2097152/2097152 bytes at offset 27262976
159
+2 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
160
+wrote 2097152/2097152 bytes at offset 29360128
161
+2 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
162
+wrote 2097152/2097152 bytes at offset 31457280
163
+2 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
164
+wrote 2097152/2097152 bytes at offset 33554432
165
+2 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
166
+wrote 2097152/2097152 bytes at offset 35651584
167
+2 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
168
+wrote 2097152/2097152 bytes at offset 37748736
169
+2 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
170
+== Checking file before ==
171
+No errors were found on the image.
172
+image size 39845888
173
+== Trying to write compressed cluster ==
174
+write failed: Input/output error
175
+image size 562949957615616
176
+== Writing normal cluster ==
177
+wrote 2097152/2097152 bytes at offset 0
178
+2 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
179
+== Checking file after ==
180
+wrote 1048576/1048576 bytes at offset 5242880
181
+1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
182
+wrote 31457280/31457280 bytes at offset 8388608
183
+30 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
184
+
185
+8388589 leaked clusters were found on the image.
186
+This means waste of disk space, but no harm to data.
187
+*** done
188
diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
189
index XXXXXXX..XXXXXXX 100644
17
index XXXXXXX..XXXXXXX 100644
190
--- a/tests/qemu-iotests/group
18
--- a/subprojects/libvduse/libvduse.c
191
+++ b/tests/qemu-iotests/group
19
+++ b/subprojects/libvduse/libvduse.c
192
@@ -XXX,XX +XXX,XX @@
20
@@ -XXX,XX +XXX,XX @@ VduseDev *vduse_dev_create_by_name(const char *name, uint16_t num_queues,
193
217 rw auto quick
21
ret = vduse_dev_init(dev, name, num_queues, ops, priv);
194
218 rw auto quick
22
if (ret < 0) {
195
219 rw auto
23
fprintf(stderr, "Failed to init vduse device %s: %s\n",
196
+220 rw auto
24
- name, strerror(ret));
197
221 rw auto quick
25
+ name, strerror(-ret));
198
222 rw auto quick
26
free(dev);
199
223 rw auto quick
27
return NULL;
28
}
29
@@ -XXX,XX +XXX,XX @@ VduseDev *vduse_dev_create(const char *name, uint32_t device_id,
30
ret = vduse_dev_init(dev, name, num_queues, ops, priv);
31
if (ret < 0) {
32
fprintf(stderr, "Failed to init vduse device %s: %s\n",
33
- name, strerror(ret));
34
+ name, strerror(-ret));
35
goto err;
36
}
37
200
--
38
--
201
2.19.1
39
2.35.3
202
203
diff view generated by jsdifflib
1
From: Li Qiang <liq3ea@gmail.com>
1
From: Lev Kujawski <lkujaw@member.fsf.org>
2
2
3
Currently, the nvme_cmb_ops mr doesn't check the addr and size.
3
For small disk images (<4 GiB), QEMU and SeaBIOS default to the
4
This can lead an oob access issue. This is triggerable in the guest.
4
LARGE/ECHS disk translation method, but it is not uncommon for other
5
Add check to avoid this issue.
5
BIOS software to use LBA in these cases as well. Some operating
6
system boot loaders (e.g., NT 4) do not handle LARGE translations
7
outside of fixed configurations. See, e.g., Q154052:
6
8
7
Fixes CVE-2018-16847.
9
"When starting an x86 based computer, Ntdetect.com retrieves and
10
stores Interrupt 13 information. . . If the disk controller is using a
11
32 sector/64 head translation scheme, this boundary will be 1 GB. If
12
the controller uses 63 sector/255 head translation [AUTHOR: i.e.,
13
LBA], the limit will be 4 GB."
8
14
9
Reported-by: Li Qiang <liq3ea@gmail.com>
15
To accommodate these situations, hd_geometry_guess() now follows the
10
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
16
disk translation specified by the user even when the ATA disk geometry
11
Signed-off-by: Li Qiang <liq3ea@gmail.com>
17
is guessed.
18
19
hd_geometry_guess():
20
* Only set the disk translation when translation is AUTO.
21
* Show the soon-to-be active translation (*ptrans) in the trace rather
22
than what was guessed.
23
24
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/56
25
Buglink: https://bugs.launchpad.net/qemu/+bug/1745312
26
27
Signed-off-by: Lev Kujawski <lkujaw@member.fsf.org>
28
Message-Id: <20220707204045.999544-1-lkujaw@member.fsf.org>
12
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
29
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
13
---
30
---
14
hw/block/nvme.c | 7 +++++++
31
hw/block/hd-geometry.c | 7 ++++++-
15
1 file changed, 7 insertions(+)
32
1 file changed, 6 insertions(+), 1 deletion(-)
16
33
17
diff --git a/hw/block/nvme.c b/hw/block/nvme.c
34
diff --git a/hw/block/hd-geometry.c b/hw/block/hd-geometry.c
18
index XXXXXXX..XXXXXXX 100644
35
index XXXXXXX..XXXXXXX 100644
19
--- a/hw/block/nvme.c
36
--- a/hw/block/hd-geometry.c
20
+++ b/hw/block/nvme.c
37
+++ b/hw/block/hd-geometry.c
21
@@ -XXX,XX +XXX,XX @@ static void nvme_cmb_write(void *opaque, hwaddr addr, uint64_t data,
38
@@ -XXX,XX +XXX,XX @@ void hd_geometry_guess(BlockBackend *blk,
22
unsigned size)
39
translation = BIOS_ATA_TRANSLATION_NONE;
23
{
40
}
24
NvmeCtrl *n = (NvmeCtrl *)opaque;
41
if (ptrans) {
25
+
42
- *ptrans = translation;
26
+ if (addr + size > NVME_CMBSZ_GETSIZE(n->bar.cmbsz)) {
43
+ if (*ptrans == BIOS_ATA_TRANSLATION_AUTO) {
27
+ return;
44
+ *ptrans = translation;
28
+ }
45
+ } else {
29
memcpy(&n->cmbuf[addr], &data, size);
46
+ /* Defer to the translation specified by the user. */
30
}
47
+ translation = *ptrans;
31
48
+ }
32
@@ -XXX,XX +XXX,XX @@ static uint64_t nvme_cmb_read(void *opaque, hwaddr addr, unsigned size)
49
}
33
uint64_t val;
50
trace_hd_geometry_guess(blk, *pcyls, *pheads, *psecs, translation);
34
NvmeCtrl *n = (NvmeCtrl *)opaque;
35
36
+ if (addr + size > NVME_CMBSZ_GETSIZE(n->bar.cmbsz)) {
37
+ return 0;
38
+ }
39
memcpy(&val, &n->cmbuf[addr], size);
40
return val;
41
}
51
}
42
--
52
--
43
2.19.1
53
2.35.3
44
45
diff view generated by jsdifflib
Deleted patch
1
From: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
2
1
3
Commit c8a35f1cf0f "fdc: use IsaDma interface instead of global DMA_*
4
functions" accidentally introduced a segfault in fdctrl_stop_transfer() for
5
non-DMA transfers.
6
7
If fdctrl->dma_chann has not been configured then the fdctrl->dma interface
8
reference isn't initialised during isabus_fdc_realize(). Unfortunately
9
fdctrl_stop_transfer() unconditionally references the DMA interface when
10
finishing the transfer causing a NULL pointer dereference.
11
12
Fix the issue by adding a check in fdctrl_stop_transfer() so that the DMA
13
interface reference and release method is only invoked if fdctrl->dma_chann
14
has been set.
15
16
(This issue was discovered by Martin testing a recent change in the NetBSD
17
installer under qemu-system-sparc)
18
19
Cc: qemu-stable@nongnu.org
20
Reported-by: Martin Husemann <martin@duskware.de>
21
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
22
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
23
Reviewed-by: Hervé Poussineau <hpoussin@reactos.org>
24
Reviewed-by: John Snow <jsnow@redhat.com>
25
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
26
---
27
hw/block/fdc.c | 2 +-
28
1 file changed, 1 insertion(+), 1 deletion(-)
29
30
diff --git a/hw/block/fdc.c b/hw/block/fdc.c
31
index XXXXXXX..XXXXXXX 100644
32
--- a/hw/block/fdc.c
33
+++ b/hw/block/fdc.c
34
@@ -XXX,XX +XXX,XX @@ static void fdctrl_stop_transfer(FDCtrl *fdctrl, uint8_t status0,
35
fdctrl->fifo[5] = cur_drv->sect;
36
fdctrl->fifo[6] = FD_SECTOR_SC;
37
fdctrl->data_dir = FD_DIR_READ;
38
- if (!(fdctrl->msr & FD_MSR_NONDMA)) {
39
+ if (fdctrl->dma_chann != -1 && !(fdctrl->msr & FD_MSR_NONDMA)) {
40
IsaDmaClass *k = ISADMA_GET_CLASS(fdctrl->dma);
41
k->release_DREQ(fdctrl->dma, fdctrl->dma_chann);
42
}
43
--
44
2.19.1
45
46
diff view generated by jsdifflib
Deleted patch
1
Don't leak 'cluster' in the mapping == NULL case. Found by Coverity
2
(CID 1055918).
3
1
4
Fixes: 8d9401c2791ee2d2805b741b1ee3006041edcd3e
5
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
6
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
7
Reviewed-by: Liam Merwick <liam.merwick@oracle.com>
8
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
9
---
10
block/vvfat.c | 6 +++---
11
1 file changed, 3 insertions(+), 3 deletions(-)
12
13
diff --git a/block/vvfat.c b/block/vvfat.c
14
index XXXXXXX..XXXXXXX 100644
15
--- a/block/vvfat.c
16
+++ b/block/vvfat.c
17
@@ -XXX,XX +XXX,XX @@ static int commit_one_file(BDRVVVFATState* s,
18
uint32_t first_cluster = c;
19
mapping_t* mapping = find_mapping_for_cluster(s, c);
20
uint32_t size = filesize_of_direntry(direntry);
21
- char* cluster = g_malloc(s->cluster_size);
22
+ char *cluster;
23
uint32_t i;
24
int fd = 0;
25
26
@@ -XXX,XX +XXX,XX @@ static int commit_one_file(BDRVVVFATState* s,
27
if (fd < 0) {
28
fprintf(stderr, "Could not open %s... (%s, %d)\n", mapping->path,
29
strerror(errno), errno);
30
- g_free(cluster);
31
return fd;
32
}
33
if (offset > 0) {
34
if (lseek(fd, offset, SEEK_SET) != offset) {
35
qemu_close(fd);
36
- g_free(cluster);
37
return -3;
38
}
39
}
40
41
+ cluster = g_malloc(s->cluster_size);
42
+
43
while (offset < size) {
44
uint32_t c1;
45
int rest_size = (size - offset > s->cluster_size ?
46
--
47
2.19.1
48
49
diff view generated by jsdifflib
1
From: Eric Blake <eblake@redhat.com>
1
From: Cole Robinson <crobinso@redhat.com>
2
2
3
Our code was already checking that we did not attempt to
3
qemu-iotests fails in the following setup:
4
allocate more clusters than what would fit in an INT64 (the
5
physical maximimum if we can access a full off_t's worth of
6
data). But this does not catch smaller limits enforced by
7
various spots in the qcow2 image description: L1 and normal
8
clusters of L2 are documented as having bits 63-56 reserved
9
for other purposes, capping our maximum offset at 64PB (bit
10
55 is the maximum bit set). And for compressed images with
11
2M clusters, the cap drops the maximum offset to bit 48, or
12
a maximum offset of 512TB. If we overflow that offset, we
13
would write compressed data into one place, but try to
14
decompress from another, which won't work.
15
4
16
It's actually possible to prove that overflow can cause image
5
./configure --enable-modules --enable-smartcard \
17
corruption without this patch; I'll add the iotests separately
6
--target-list=x86_64-softmmu,s390x-softmmu
18
in the next commit.
7
make
8
cd build
9
QEMU_PROG=`pwd`/s390x-softmmu/qemu-system-s390x \
10
../tests/check-block.sh qcow2
11
...
12
--- /home/crobinso/src/qemu/tests/qemu-iotests/127.out
13
+++ /home/crobinso/src/qemu/build/tests/qemu-iotests/scratch/127.out.bad
14
@@ -1,4 +1,18 @@
15
QA output created by 127
16
+Failed to open module: /home/crobinso/src/qemu/build/hw-usb-smartcard.so: undefined symbol: ccid_card_ccid_attach
17
...
18
--- /home/crobinso/src/qemu/tests/qemu-iotests/267.out
19
+++ /home/crobinso/src/qemu/build/tests/qemu-iotests/scratch/267.out.bad
20
@@ -1,4 +1,11 @@
21
QA output created by 267
22
+Failed to open module: /home/crobinso/src/qemu/build/hw-usb-smartcard.so: undefined symbol: ccid_card_ccid_attach
19
23
20
Signed-off-by: Eric Blake <eblake@redhat.com>
24
The stderr spew is its own known issue, but seems like iotests should
21
Reviewed-by: Alberto Garcia <berto@igalia.com>
25
be discarding stderr in this case.
26
27
Signed-off-by: Cole Robinson <crobinso@redhat.com>
28
Reviewed-by: Thomas Huth <thuth@redhat.com>
22
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
29
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
23
---
30
---
24
block/qcow2.h | 6 ++++++
31
tests/qemu-iotests/common.rc | 4 ++--
25
block/qcow2-refcount.c | 20 +++++++++++++-------
32
1 file changed, 2 insertions(+), 2 deletions(-)
26
2 files changed, 19 insertions(+), 7 deletions(-)
27
33
28
diff --git a/block/qcow2.h b/block/qcow2.h
34
diff --git a/tests/qemu-iotests/common.rc b/tests/qemu-iotests/common.rc
29
index XXXXXXX..XXXXXXX 100644
35
index XXXXXXX..XXXXXXX 100644
30
--- a/block/qcow2.h
36
--- a/tests/qemu-iotests/common.rc
31
+++ b/block/qcow2.h
37
+++ b/tests/qemu-iotests/common.rc
32
@@ -XXX,XX +XXX,XX @@
38
@@ -XXX,XX +XXX,XX @@ _require_large_file()
33
#define QCOW_MAX_CRYPT_CLUSTERS 32
39
#
34
#define QCOW_MAX_SNAPSHOTS 65536
40
_require_devices()
35
36
+/* Field widths in qcow2 mean normal cluster offsets cannot reach
37
+ * 64PB; depending on cluster size, compressed clusters can have a
38
+ * smaller limit (64PB for up to 16k clusters, then ramps down to
39
+ * 512TB for 2M clusters). */
40
+#define QCOW_MAX_CLUSTER_OFFSET ((1ULL << 56) - 1)
41
+
42
/* 8 MB refcount table is enough for 2 PB images at 64k cluster size
43
* (128 GB for 512 byte clusters, 2 EB for 2 MB clusters) */
44
#define QCOW_MAX_REFTABLE_SIZE S_8MiB
45
diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
46
index XXXXXXX..XXXXXXX 100644
47
--- a/block/qcow2-refcount.c
48
+++ b/block/qcow2-refcount.c
49
@@ -XXX,XX +XXX,XX @@
50
#include "qemu/bswap.h"
51
#include "qemu/cutils.h"
52
53
-static int64_t alloc_clusters_noref(BlockDriverState *bs, uint64_t size);
54
+static int64_t alloc_clusters_noref(BlockDriverState *bs, uint64_t size,
55
+ uint64_t max);
56
static int QEMU_WARN_UNUSED_RESULT update_refcount(BlockDriverState *bs,
57
int64_t offset, int64_t length, uint64_t addend,
58
bool decrease, enum qcow2_discard_type type);
59
@@ -XXX,XX +XXX,XX @@ static int alloc_refcount_block(BlockDriverState *bs,
60
}
61
62
/* Allocate the refcount block itself and mark it as used */
63
- int64_t new_block = alloc_clusters_noref(bs, s->cluster_size);
64
+ int64_t new_block = alloc_clusters_noref(bs, s->cluster_size, INT64_MAX);
65
if (new_block < 0) {
66
return new_block;
67
}
68
@@ -XXX,XX +XXX,XX @@ int qcow2_update_cluster_refcount(BlockDriverState *bs,
69
70
71
/* return < 0 if error */
72
-static int64_t alloc_clusters_noref(BlockDriverState *bs, uint64_t size)
73
+static int64_t alloc_clusters_noref(BlockDriverState *bs, uint64_t size,
74
+ uint64_t max)
75
{
41
{
76
BDRVQcow2State *s = bs->opaque;
42
- available=$($QEMU -M none -device help | \
77
uint64_t i, nb_clusters, refcount;
43
+ available=$($QEMU -M none -device help 2> /dev/null | \
78
@@ -XXX,XX +XXX,XX @@ retry:
44
grep ^name | sed -e 's/^name "//' -e 's/".*$//')
79
}
45
for device
80
46
do
81
/* Make sure that all offsets in the "allocated" range are representable
47
@@ -XXX,XX +XXX,XX @@ _require_devices()
82
- * in an int64_t */
48
83
+ * in the requested max */
49
_require_one_device_of()
84
if (s->free_cluster_index > 0 &&
50
{
85
- s->free_cluster_index - 1 > (INT64_MAX >> s->cluster_bits))
51
- available=$($QEMU -M none -device help | \
86
+ s->free_cluster_index - 1 > (max >> s->cluster_bits))
52
+ available=$($QEMU -M none -device help 2> /dev/null | \
87
{
53
grep ^name | sed -e 's/^name "//' -e 's/".*$//')
88
return -EFBIG;
54
for device
89
}
55
do
90
@@ -XXX,XX +XXX,XX @@ int64_t qcow2_alloc_clusters(BlockDriverState *bs, uint64_t size)
91
92
BLKDBG_EVENT(bs->file, BLKDBG_CLUSTER_ALLOC);
93
do {
94
- offset = alloc_clusters_noref(bs, size);
95
+ offset = alloc_clusters_noref(bs, size, QCOW_MAX_CLUSTER_OFFSET);
96
if (offset < 0) {
97
return offset;
98
}
99
@@ -XXX,XX +XXX,XX @@ int64_t qcow2_alloc_bytes(BlockDriverState *bs, int size)
100
free_in_cluster = s->cluster_size - offset_into_cluster(s, offset);
101
do {
102
if (!offset || free_in_cluster < size) {
103
- int64_t new_cluster = alloc_clusters_noref(bs, s->cluster_size);
104
+ int64_t new_cluster;
105
+
106
+ new_cluster = alloc_clusters_noref(bs, s->cluster_size,
107
+ MIN(s->cluster_offset_mask,
108
+ QCOW_MAX_CLUSTER_OFFSET));
109
if (new_cluster < 0) {
110
return new_cluster;
111
}
112
--
56
--
113
2.19.1
57
2.35.3
114
115
diff view generated by jsdifflib
1
From: Eric Blake <eblake@redhat.com>
1
From: Emanuele Giuseppe Esposito <eesposit@redhat.com>
2
2
3
Although off_t permits up to 63 bits (8EB) of file offsets, in
3
If we go directly to GLOBAL_STATE_CODE, IO_CODE or IO_OR_GS_CODE
4
practice, we're going to hit other limits first. Document some
4
definition, we just find that they "mark and check that the function
5
of those limits in the qcow2 spec (some are inherent, others are
5
is part of the {category} API".
6
implementation choices of qemu), and how choice of cluster size
6
However, ther is no definition on what {category} API is, they are
7
can influence some of the limits.
7
in include/block/block-*.h
8
Therefore, add a comment that refers to such documentation.
8
9
9
While we cannot map any uncompressed virtual cluster to any
10
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
10
address higher than 64 PB (56 bits) (due to the current L1/L2
11
Message-Id: <20220609122206.1016936-1-eesposit@redhat.com>
11
field encoding stopping at bit 55), qemu's cap of 8M for the
12
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
12
refcount table can still access larger host addresses for some
13
combinations of large clusters and small refcount_order. For
14
comparison, ext4 with 4k blocks caps files at 16PB.
15
16
Another interesting limit: for compressed clusters, the L2 layout
17
requires an ever-smaller maximum host offset as cluster size gets
18
larger, down to a 512 TB maximum with 2M clusters. In particular,
19
note that with a cluster size of 8k or smaller, the L2 entry for
20
a compressed cluster could technically point beyond the 64PB mark,
21
but when you consider that with 8k clusters and refcount_order = 0,
22
you cannot access beyond 512T without exceeding qemu's limit of an
23
8M cap on the refcount table, it is unlikely that any image in the
24
wild has attempted to do so. To be safe, let's document that bits
25
beyond 55 in a compressed cluster must be 0.
26
27
Signed-off-by: Eric Blake <eblake@redhat.com>
28
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
13
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
29
---
14
---
30
docs/interop/qcow2.txt | 38 ++++++++++++++++++++++++++++++++++++--
15
include/qemu/main-loop.h | 18 +++++++++++++++---
31
1 file changed, 36 insertions(+), 2 deletions(-)
16
1 file changed, 15 insertions(+), 3 deletions(-)
32
17
33
diff --git a/docs/interop/qcow2.txt b/docs/interop/qcow2.txt
18
diff --git a/include/qemu/main-loop.h b/include/qemu/main-loop.h
34
index XXXXXXX..XXXXXXX 100644
19
index XXXXXXX..XXXXXXX 100644
35
--- a/docs/interop/qcow2.txt
20
--- a/include/qemu/main-loop.h
36
+++ b/docs/interop/qcow2.txt
21
+++ b/include/qemu/main-loop.h
37
@@ -XXX,XX +XXX,XX @@ The first cluster of a qcow2 image contains the file header:
22
@@ -XXX,XX +XXX,XX @@ bool qemu_mutex_iothread_locked(void);
38
with larger cluster sizes.
23
*/
39
24
bool qemu_in_main_thread(void);
40
24 - 31: size
25
41
- Virtual disk size in bytes
26
-/* Mark and check that the function is part of the global state API. */
42
+ Virtual disk size in bytes.
27
+/*
43
+
28
+ * Mark and check that the function is part of the Global State API.
44
+ Note: qemu has an implementation limit of 32 MB as
29
+ * Please refer to include/block/block-global-state.h for more
45
+ the maximum L1 table size. With a 2 MB cluster
30
+ * information about GS API.
46
+ size, it is unable to populate a virtual cluster
31
+ */
47
+ beyond 2 EB (61 bits); with a 512 byte cluster
32
#ifdef CONFIG_COCOA
48
+ size, it is unable to populate a virtual size
33
/*
49
+ larger than 128 GB (37 bits). Meanwhile, L1/L2
34
* When using the Cocoa UI, addRemovableDevicesMenuItems() is called from
50
+ table layouts limit an image to no more than 64 PB
35
@@ -XXX,XX +XXX,XX @@ bool qemu_in_main_thread(void);
51
+ (56 bits) of populated clusters, and an image may
36
} while (0)
52
+ hit other limits first (such as a file system's
37
#endif /* CONFIG_COCOA */
53
+ maximum size).
38
54
39
-/* Mark and check that the function is part of the I/O API. */
55
32 - 35: crypt_method
40
+/*
56
0 for no encryption
41
+ * Mark and check that the function is part of the I/O API.
57
@@ -XXX,XX +XXX,XX @@ in the image file.
42
+ * Please refer to include/block/block-io.h for more
58
It contains pointers to the second level structures which are called refcount
43
+ * information about IO API.
59
blocks and are exactly one cluster in size.
44
+ */
60
45
#define IO_CODE() \
61
+Although a large enough refcount table can reserve clusters past 64 PB
46
do { \
62
+(56 bits) (assuming the underlying protocol can even be sized that
47
/* nop */ \
63
+large), note that some qcow2 metadata such as L1/L2 tables must point
48
} while (0)
64
+to clusters prior to that point.
49
65
+
50
-/* Mark and check that the function is part of the "I/O OR GS" API. */
66
+Note: qemu has an implementation limit of 8 MB as the maximum refcount
51
+/*
67
+table size. With a 2 MB cluster size and a default refcount_order of
52
+ * Mark and check that the function is part of the "I/O OR GS" API.
68
+4, it is unable to reference host resources beyond 2 EB (61 bits); in
53
+ * Please refer to include/block/block-io.h for more
69
+the worst case, with a 512 cluster size and refcount_order of 6, it is
54
+ * information about "IO or GS" API.
70
+unable to access beyond 32 GB (35 bits).
55
+ */
71
+
56
#define IO_OR_GS_CODE() \
72
Given an offset into the image file, the refcount of its cluster can be
57
do { \
73
obtained as follows:
58
/* nop */ \
74
75
@@ -XXX,XX +XXX,XX @@ The L1 table has a variable size (stored in the header) and may use multiple
76
clusters, however it must be contiguous in the image file. L2 tables are
77
exactly one cluster in size.
78
79
+The L1 and L2 tables have implications on the maximum virtual file
80
+size; for a given L1 table size, a larger cluster size is required for
81
+the guest to have access to more space. Furthermore, a virtual
82
+cluster must currently map to a host offset below 64 PB (56 bits)
83
+(although this limit could be relaxed by putting reserved bits into
84
+use). Additionally, as cluster size increases, the maximum host
85
+offset for a compressed cluster is reduced (a 2M cluster size requires
86
+compressed clusters to reside below 512 TB (49 bits), and this limit
87
+cannot be relaxed without an incompatible layout change).
88
+
89
Given an offset into the virtual disk, the offset into the image file can be
90
obtained as follows:
91
92
@@ -XXX,XX +XXX,XX @@ Standard Cluster Descriptor:
93
Compressed Clusters Descriptor (x = 62 - (cluster_bits - 8)):
94
95
Bit 0 - x-1: Host cluster offset. This is usually _not_ aligned to a
96
- cluster or sector boundary!
97
+ cluster or sector boundary! If cluster_bits is
98
+ small enough that this field includes bits beyond
99
+ 55, those upper bits must be set to 0.
100
101
x - 61: Number of additional 512-byte sectors used for the
102
compressed data, beyond the sector containing the offset
103
--
59
--
104
2.19.1
60
2.35.3
105
106
diff view generated by jsdifflib