The following changes since commit c1eb2ddf0f8075faddc5f7c3d39feae3e8e9d6b4:

  Update version for v8.0.0 release (2023-04-19 17:27:13 +0100)

are available in the Git repository at:

  https://gitlab.com/stefanha/qemu.git tags/block-pull-request

for you to fetch changes up to 36e5e9b22abe56aa00ca067851555ad8127a7966:

  tracing: install trace events file only if necessary (2023-04-20 07:39:43 -0400)

----------------------------------------------------------------
Pull request

Sam Li's zoned storage work and fixes I collected during the 8.0 freeze.

----------------------------------------------------------------

Carlos Santos (1):
  tracing: install trace events file only if necessary

Philippe Mathieu-Daudé (1):
  block/dmg: Declare a type definition for DMG uncompress function

Sam Li (17):
  block/block-common: add zoned device structs
  block/file-posix: introduce helper functions for sysfs attributes
  block/block-backend: add block layer APIs resembling Linux
    ZonedBlockDevice ioctls
  block/raw-format: add zone operations to pass through requests
  block: add zoned BlockDriver check to block layer
  iotests: test new zone operations
  block: add some trace events for new block layer APIs
  docs/zoned-storage: add zoned device documentation
  file-posix: add tracking of the zone write pointers
  block: introduce zone append write for zoned devices
  qemu-iotests: test zone append operation
  block: add some trace events for zone append
  include: update virtio_blk headers to v6.3-rc1
  virtio-blk: add zoned storage emulation for zoned devices
  block: add accounting for zone append operation
  virtio-blk: add some trace events for zoned emulation
  docs/zoned-storage:add zoned emulation use case

Thomas De Schampheleire (1):
  tracetool: use relative paths for '#line' preprocessor directives

 docs/devel/index-api.rst | 1 +
 docs/devel/zoned-storage.rst | 62 ++
 qapi/block-core.json | 68 +-
 qapi/block.json | 4 +
 meson.build | 4 +
 block/dmg.h | 8 +-
 include/block/accounting.h | 1 +
 include/block/block-common.h | 57 ++
 include/block/block-io.h | 13 +
 include/block/block_int-common.h | 37 +
 include/block/raw-aio.h | 8 +-
 include/standard-headers/drm/drm_fourcc.h | 12 +
 include/standard-headers/linux/ethtool.h | 48 +-
 include/standard-headers/linux/fuse.h | 45 +-
 include/standard-headers/linux/pci_regs.h | 1 +
 include/standard-headers/linux/vhost_types.h | 2 +
 include/standard-headers/linux/virtio_blk.h | 105 +++
 include/sysemu/block-backend-io.h | 27 +
 linux-headers/asm-arm64/kvm.h | 1 +
 linux-headers/asm-x86/kvm.h | 34 +-
 linux-headers/linux/kvm.h | 9 +
 linux-headers/linux/vfio.h | 15 +-
 linux-headers/linux/vhost.h | 8 +
 block.c | 19 +
 block/block-backend.c | 193 ++++++
 block/dmg.c | 7 +-
 block/file-posix.c | 677 +++++++++++++++++--
 block/io.c | 68 ++
 block/io_uring.c | 4 +
 block/linux-aio.c | 3 +
 block/qapi-sysemu.c | 11 +
 block/qapi.c | 18 +
 block/raw-format.c | 26 +
 hw/block/virtio-blk-common.c | 2 +
 hw/block/virtio-blk.c | 405 +++++++++++
 hw/virtio/virtio-qmp.c | 2 +
 qemu-io-cmds.c | 224 ++++++
 block/trace-events | 4 +
 docs/system/qemu-block-drivers.rst.inc | 6 +
 hw/block/trace-events | 7 +
 scripts/tracetool/backend/ftrace.py | 4 +-
 scripts/tracetool/backend/log.py | 4 +-
 scripts/tracetool/backend/syslog.py | 4 +-
 tests/qemu-iotests/tests/zoned | 105 +++
 tests/qemu-iotests/tests/zoned.out | 69 ++
 trace/meson.build | 2 +-
 46 files changed, 2353 insertions(+), 81 deletions(-)
 create mode 100644 docs/devel/zoned-storage.rst
 create mode 100755 tests/qemu-iotests/tests/zoned
 create mode 100644 tests/qemu-iotests/tests/zoned.out

--
2.39.2
From: Sam Li <faithilikerun@gmail.com>

Signed-off-by: Sam Li <faithilikerun@gmail.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Acked-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 20230324090605.28361-2-faithilikerun@gmail.com
[Adjust commit message prefix as suggested by Philippe Mathieu-Daudé
<philmd@linaro.org>.
--Stefan]
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 include/block/block-common.h | 43 ++++++++++++++++++++++++++++++++++++
 1 file changed, 43 insertions(+)

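Reader's note (not part of the commit): a minimal sketch of how a caller might consume a BlockZoneDescriptor. It assumes that `wp` is an absolute byte offset and that `cap` is a size counted from `start`, which matches how the later file-posix patch in this series fills the descriptor from the kernel's zone report. The types are copied locally so the sketch compiles on its own, and `zone_bytes_left()` is a hypothetical helper, not a QEMU function.

```c
#include <assert.h>
#include <stdint.h>

/* Local copies of the types added by this patch, so the sketch stands alone. */
typedef enum BlockZoneState {
    BLK_ZS_NOT_WP = 0x0,
    BLK_ZS_EMPTY = 0x1,
    BLK_ZS_IOPEN = 0x2,
    BLK_ZS_EOPEN = 0x3,
    BLK_ZS_CLOSED = 0x4,
    BLK_ZS_RDONLY = 0xD,
    BLK_ZS_FULL = 0xE,
    BLK_ZS_OFFLINE = 0xF,
} BlockZoneState;

typedef enum BlockZoneType {
    BLK_ZT_CONV = 0x1, /* Conventional random writes supported */
    BLK_ZT_SWR = 0x2,  /* Sequential writes required */
    BLK_ZT_SWP = 0x3,  /* Sequential writes preferred */
} BlockZoneType;

typedef struct BlockZoneDescriptor {
    uint64_t start;
    uint64_t length;
    uint64_t cap;
    uint64_t wp;
    BlockZoneType type;
    BlockZoneState state;
} BlockZoneDescriptor;

/*
 * Hypothetical helper: number of bytes that can still be written to a zone.
 * Assumption: wp is absolute, cap is relative to start, all in bytes.
 */
static uint64_t zone_bytes_left(const BlockZoneDescriptor *z)
{
    uint64_t end = z->start + z->cap;

    if (z->type == BLK_ZT_CONV) {
        /* No write pointer: any offset within the capacity may be written. */
        return z->cap;
    }
    if (z->state == BLK_ZS_FULL || z->state == BLK_ZS_RDONLY ||
        z->state == BLK_ZS_OFFLINE) {
        return 0;
    }
    return z->wp < end ? end - z->wp : 0;
}
```

For a sequential-write-required zone starting at 1 MiB with a 512 KiB capacity and a write pointer 4 KiB past the start, the helper reports 512 KiB minus 4 KiB.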
diff --git a/include/block/block-common.h b/include/block/block-common.h
index XXXXXXX..XXXXXXX 100644
--- a/include/block/block-common.h
+++ b/include/block/block-common.h
@@ -XXX,XX +XXX,XX @@ typedef struct BlockDriver BlockDriver;
 typedef struct BdrvChild BdrvChild;
 typedef struct BdrvChildClass BdrvChildClass;

+typedef enum BlockZoneOp {
+    BLK_ZO_OPEN,
+    BLK_ZO_CLOSE,
+    BLK_ZO_FINISH,
+    BLK_ZO_RESET,
+} BlockZoneOp;
+
+typedef enum BlockZoneModel {
+    BLK_Z_NONE = 0x0, /* Regular block device */
+    BLK_Z_HM = 0x1, /* Host-managed zoned block device */
+    BLK_Z_HA = 0x2, /* Host-aware zoned block device */
+} BlockZoneModel;
+
+typedef enum BlockZoneState {
+    BLK_ZS_NOT_WP = 0x0,
+    BLK_ZS_EMPTY = 0x1,
+    BLK_ZS_IOPEN = 0x2,
+    BLK_ZS_EOPEN = 0x3,
+    BLK_ZS_CLOSED = 0x4,
+    BLK_ZS_RDONLY = 0xD,
+    BLK_ZS_FULL = 0xE,
+    BLK_ZS_OFFLINE = 0xF,
+} BlockZoneState;
+
+typedef enum BlockZoneType {
+    BLK_ZT_CONV = 0x1, /* Conventional random writes supported */
+    BLK_ZT_SWR = 0x2, /* Sequential writes required */
+    BLK_ZT_SWP = 0x3, /* Sequential writes preferred */
+} BlockZoneType;
+
+/*
+ * Zone descriptor data structure.
+ * Provides information on a zone with all position and size values in bytes.
+ */
+typedef struct BlockZoneDescriptor {
+    uint64_t start;
+    uint64_t length;
+    uint64_t cap;
+    uint64_t wp;
+    BlockZoneType type;
+    BlockZoneState state;
+} BlockZoneDescriptor;
+
 typedef struct BlockDriverInfo {
     /* in bytes, 0 if irrelevant */
     int cluster_size;
--
2.39.2
From: Sam Li <faithilikerun@gmail.com>

Add get_sysfs_str_val() to read a sysfs queue attribute of the host
block device as a string. get_sysfs_zoned_model() uses it to read the
device's zoned model and convert it to QEMU's BlockZoneModel type.

Add get_sysfs_long_val() to read a sysfs attribute, such as zoned device
information, as a long integer. hdev_get_max_segments() is converted to
use it.

Signed-off-by: Sam Li <faithilikerun@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Acked-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 20230324090605.28361-3-faithilikerun@gmail.com
[Adjust commit message prefix as suggested by Philippe Mathieu-Daudé
<philmd@linaro.org>.
--Stefan]
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 include/block/block_int-common.h | 3 +
 block/file-posix.c | 130 ++++++++++++++++++++++---------
 2 files changed, 95 insertions(+), 38 deletions(-)

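Reader's note (not part of the commit): the parsing steps described above can be sketched without QEMU or GLib. The block below assumes the attribute text has already been read from `/sys/dev/block/<major>:<minor>/queue/<attribute>`; `strip_newline()`, `parse_zoned_model()` and `parse_long_attr()` are hypothetical names chosen for this illustration, and the enum is a local copy of the one added earlier in the series.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Local copy of the zone model enum, so the sketch stands alone. */
typedef enum BlockZoneModel {
    BLK_Z_NONE = 0x0,
    BLK_Z_HM = 0x1,
    BLK_Z_HA = 0x2,
} BlockZoneModel;

/* sysfs attribute files end with '\n'; strip one trailing newline in place. */
static void strip_newline(char *s)
{
    size_t len = strlen(s);

    if (len > 0 && s[len - 1] == '\n') {
        s[len - 1] = '\0';
    }
}

/* Map the text of the "zoned" attribute to a model; 0 or -ENOTSUP. */
static int parse_zoned_model(const char *val, BlockZoneModel *zoned)
{
    if (strcmp(val, "host-managed") == 0) {
        *zoned = BLK_Z_HM;
    } else if (strcmp(val, "host-aware") == 0) {
        *zoned = BLK_Z_HA;
    } else if (strcmp(val, "none") == 0) {
        *zoned = BLK_Z_NONE;
    } else {
        return -ENOTSUP;
    }
    return 0;
}

/* Parse a decimal attribute; the whole string must be consumed. */
static int parse_long_attr(const char *val, long *out)
{
    char *end;
    long v;

    errno = 0;
    v = strtol(val, &end, 10);
    if (errno != 0 || end == val || *end != '\0') {
        return -EINVAL;
    }
    *out = v;
    return 0;
}
```

Unlike get_sysfs_long_val() in the patch, which returns the parsed value or a negative errno through the same return value, this sketch uses an out-parameter. That is a design choice of the illustration only; it keeps error codes and attribute values from sharing one number space.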
diff --git a/include/block/block_int-common.h b/include/block/block_int-common.h
27
index XXXXXXX..XXXXXXX 100644
28
--- a/include/block/block_int-common.h
29
+++ b/include/block/block_int-common.h
30
@@ -XXX,XX +XXX,XX @@ typedef struct BlockLimits {
31
* an explicit monitor command to load the disk inside the guest).
32
*/
33
bool has_variable_length;
34
+
35
+ /* device zone model */
36
+ BlockZoneModel zoned;
37
} BlockLimits;
38
39
typedef struct BdrvOpBlocker BdrvOpBlocker;
40
diff --git a/block/file-posix.c b/block/file-posix.c
41
index XXXXXXX..XXXXXXX 100644
42
--- a/block/file-posix.c
43
+++ b/block/file-posix.c
44
@@ -XXX,XX +XXX,XX @@ static int hdev_get_max_hw_transfer(int fd, struct stat *st)
45
#endif
46
}
47
48
-static int hdev_get_max_segments(int fd, struct stat *st)
+/*
+ * Get a sysfs attribute value as character string.
+ */
+static int get_sysfs_str_val(struct stat *st, const char *attribute,
+                             char **val) {
+#ifdef CONFIG_LINUX
+    g_autofree char *sysfspath = NULL;
+    gboolean ok;
+    size_t len;
+
+    if (!S_ISBLK(st->st_mode)) {
+        return -ENOTSUP;
+    }
+
+    sysfspath = g_strdup_printf("/sys/dev/block/%u:%u/queue/%s",
+                                major(st->st_rdev), minor(st->st_rdev),
+                                attribute);
+    ok = g_file_get_contents(sysfspath, val, &len, NULL);
+    if (!ok) {
+        return -ENOENT;
+    }
+
+    /* The file ends with '\n'; strip it */
+    char *p;
+    p = *val;
+    if (len > 0 && *(p + len - 1) == '\n') {
+        *(p + len - 1) = '\0';
+    }
+    return 0;
+#else
+    return -ENOTSUP;
+#endif
+}
82
+
83
+static int get_sysfs_zoned_model(struct stat *st, BlockZoneModel *zoned)
84
+{
85
+ g_autofree char *val = NULL;
86
+ int ret;
87
+
88
+ ret = get_sysfs_str_val(st, "zoned", &val);
89
+ if (ret < 0) {
90
+ return ret;
91
+ }
92
+
93
+ if (strcmp(val, "host-managed") == 0) {
94
+ *zoned = BLK_Z_HM;
95
+ } else if (strcmp(val, "host-aware") == 0) {
96
+ *zoned = BLK_Z_HA;
97
+ } else if (strcmp(val, "none") == 0) {
98
+ *zoned = BLK_Z_NONE;
99
+ } else {
100
+ return -ENOTSUP;
101
+ }
102
+ return 0;
103
+}
104
+
105
+/*
106
+ * Get a sysfs attribute value as a long integer.
107
+ */
108
+static long get_sysfs_long_val(struct stat *st, const char *attribute)
109
{
110
#ifdef CONFIG_LINUX
111
- char buf[32];
112
+ g_autofree char *str = NULL;
113
const char *end;
114
- char *sysfspath = NULL;
115
+ long val;
116
+ int ret;
117
+
118
+ ret = get_sysfs_str_val(st, attribute, &str);
119
+ if (ret < 0) {
120
+ return ret;
121
+ }
122
+
123
+ /* The trailing '\n' was already stripped, so 'end' must point at '\0'. */
124
+ ret = qemu_strtol(str, &end, 10, &val);
125
+ if (ret == 0 && end && *end == '\0') {
126
+ ret = val;
127
+ }
128
+ return ret;
129
+#else
130
+ return -ENOTSUP;
131
+#endif
132
+}
133
+
134
+static int hdev_get_max_segments(int fd, struct stat *st)
135
+{
136
+#ifdef CONFIG_LINUX
137
int ret;
138
- int sysfd = -1;
139
- long max_segments;
140
141
if (S_ISCHR(st->st_mode)) {
142
if (ioctl(fd, SG_GET_SG_TABLESIZE, &ret) == 0) {
143
@@ -XXX,XX +XXX,XX @@ static int hdev_get_max_segments(int fd, struct stat *st)
144
}
145
return -ENOTSUP;
146
}
147
-
148
- if (!S_ISBLK(st->st_mode)) {
149
- return -ENOTSUP;
150
- }
151
-
152
- sysfspath = g_strdup_printf("/sys/dev/block/%u:%u/queue/max_segments",
153
- major(st->st_rdev), minor(st->st_rdev));
154
- sysfd = open(sysfspath, O_RDONLY);
155
- if (sysfd == -1) {
156
- ret = -errno;
157
- goto out;
158
- }
159
- ret = RETRY_ON_EINTR(read(sysfd, buf, sizeof(buf) - 1));
160
- if (ret < 0) {
161
- ret = -errno;
162
- goto out;
163
- } else if (ret == 0) {
164
- ret = -EIO;
165
- goto out;
166
- }
167
- buf[ret] = 0;
168
- /* The file is ended with '\n', pass 'end' to accept that. */
169
- ret = qemu_strtol(buf, &end, 10, &max_segments);
170
- if (ret == 0 && end && *end == '\n') {
171
- ret = max_segments;
172
- }
173
-
174
-out:
175
- if (sysfd != -1) {
176
- close(sysfd);
177
- }
178
- g_free(sysfspath);
179
- return ret;
180
+ return get_sysfs_long_val(st, "max_segments");
181
#else
182
return -ENOTSUP;
183
#endif
184
@@ -XXX,XX +XXX,XX @@ static void raw_refresh_limits(BlockDriverState *bs, Error **errp)
185
{
186
BDRVRawState *s = bs->opaque;
187
struct stat st;
188
+ int ret;
189
+ BlockZoneModel zoned;
190
191
s->needs_alignment = raw_needs_alignment(bs);
192
raw_probe_alignment(bs, s->fd, errp);
193
@@ -XXX,XX +XXX,XX @@ static void raw_refresh_limits(BlockDriverState *bs, Error **errp)
194
bs->bl.max_hw_iov = ret;
195
}
196
}
197
+
198
+ ret = get_sysfs_zoned_model(&st, &zoned);
199
+ if (ret < 0) {
200
+ zoned = BLK_Z_NONE;
201
+ }
202
+ bs->bl.zoned = zoned;
203
}
204
205
static int check_for_dasd(int fd)
206
--
207
2.39.2
From: Sam Li <faithilikerun@gmail.com>

Add a zoned device option to the host_device BlockDriver. It is presented
only for zoned host block devices. With zone management operations added
to the host_device BlockDriver, users can use the new block layer APIs,
including Report Zone and the zone management operations
(open, close, finish, reset, reset_all).

qemu-io uses the new APIs to issue zoned storage commands to the device:
zone_report(zrp), zone_open(zo), zone_close(zc), zone_reset(zrs),
zone_finish(zf).

For example, to test zone_report, use the following command:
$ ./build/qemu-io --image-opts -n driver=host_device,filename=/dev/nullb0 \
  -c "zrp offset nr_zones"

Signed-off-by: Sam Li <faithilikerun@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Acked-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 20230324090605.28361-4-faithilikerun@gmail.com
[Adjust commit message prefix as suggested by Philippe Mathieu-Daudé
<philmd@linaro.org> and remove spurious ret = -errno in
raw_co_zone_mgmt().
--Stefan]
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 meson.build | 4 +
 include/block/block-io.h | 9 +
 include/block/block_int-common.h | 21 ++
 include/block/raw-aio.h | 6 +-
 include/sysemu/block-backend-io.h | 18 ++
 block/block-backend.c | 133 +++++++++++++
 block/file-posix.c | 306 +++++++++++++++++++++++++++++-
 block/io.c | 41 ++++
 qemu-io-cmds.c | 149 +++++++++++++++
 9 files changed, 684 insertions(+), 3 deletions(-)

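Reader's note (not part of the commit): the range checks that raw_co_zone_mgmt() applies before issuing a zone management ioctl can be sketched as a stand-alone function. The sketch assumes the zone size is a power of two, which the `zone_size - 1` mask in the patch relies on; `check_zone_mgmt_range()` and `bytes_to_sectors()` are hypothetical names used only here, and `SECTOR_BITS` stands in for QEMU's BDRV_SECTOR_BITS.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define SECTOR_BITS 9 /* 512-byte sectors, standing in for BDRV_SECTOR_BITS */

/*
 * Validate a byte range for a zone management operation.
 * Assumption: zone_size is a power of two, so (zone_size - 1) is a mask.
 */
static int check_zone_mgmt_range(int64_t offset, int64_t len,
                                 int64_t zone_size, int64_t capacity)
{
    int64_t mask = zone_size - 1;

    if (offset & mask) {
        return -EINVAL; /* must start on a zone boundary */
    }
    if (offset + len > capacity) {
        return -EINVAL; /* must not run past the end of the device */
    }
    if (offset + len < capacity && (len & mask)) {
        return -EINVAL; /* only a range ending at the device end may be short */
    }
    return 0;
}

/* Convert a validated byte range to the 512-byte sectors the ioctl expects. */
static void bytes_to_sectors(int64_t offset, int64_t len,
                             uint64_t *sector, uint64_t *nr_sectors)
{
    *sector = (uint64_t)offset >> SECTOR_BITS;
    *nr_sectors = (uint64_t)len >> SECTOR_BITS;
}
```

With a 1 MiB zone size on a device whose last zone is 4 KiB short, a range covering exactly that last zone passes, while a full 1 MiB range starting at the last zone is rejected because it runs past the device.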
diff --git a/meson.build b/meson.build
41
index XXXXXXX..XXXXXXX 100644
42
--- a/meson.build
43
+++ b/meson.build
44
@@ -XXX,XX +XXX,XX @@ config_host_data.set('CONFIG_REPLICATION', get_option('replication').allowed())
45
# has_header
46
config_host_data.set('CONFIG_EPOLL', cc.has_header('sys/epoll.h'))
47
config_host_data.set('CONFIG_LINUX_MAGIC_H', cc.has_header('linux/magic.h'))
48
+config_host_data.set('CONFIG_BLKZONED', cc.has_header('linux/blkzoned.h'))
49
config_host_data.set('CONFIG_VALGRIND_H', cc.has_header('valgrind/valgrind.h'))
50
config_host_data.set('HAVE_BTRFS_H', cc.has_header('linux/btrfs.h'))
51
config_host_data.set('HAVE_DRM_H', cc.has_header('libdrm/drm.h'))
52
@@ -XXX,XX +XXX,XX @@ config_host_data.set('HAVE_SIGEV_NOTIFY_THREAD_ID',
53
config_host_data.set('HAVE_STRUCT_STAT_ST_ATIM',
54
cc.has_member('struct stat', 'st_atim',
55
prefix: '#include <sys/stat.h>'))
56
+config_host_data.set('HAVE_BLK_ZONE_REP_CAPACITY',
57
+ cc.has_member('struct blk_zone', 'capacity',
58
+ prefix: '#include <linux/blkzoned.h>'))
59
60
# has_type
61
config_host_data.set('CONFIG_IOVEC',
62
diff --git a/include/block/block-io.h b/include/block/block-io.h
63
index XXXXXXX..XXXXXXX 100644
64
--- a/include/block/block-io.h
65
+++ b/include/block/block-io.h
66
@@ -XXX,XX +XXX,XX @@ int coroutine_fn GRAPH_RDLOCK bdrv_co_flush(BlockDriverState *bs);
67
int coroutine_fn GRAPH_RDLOCK bdrv_co_pdiscard(BdrvChild *child, int64_t offset,
68
int64_t bytes);
69
70
+/* Report zone information of zoned block device. */
71
+int coroutine_fn GRAPH_RDLOCK bdrv_co_zone_report(BlockDriverState *bs,
72
+ int64_t offset,
73
+ unsigned int *nr_zones,
74
+ BlockZoneDescriptor *zones);
75
+int coroutine_fn GRAPH_RDLOCK bdrv_co_zone_mgmt(BlockDriverState *bs,
76
+ BlockZoneOp op,
77
+ int64_t offset, int64_t len);
78
+
79
bool bdrv_can_write_zeroes_with_unmap(BlockDriverState *bs);
80
int bdrv_block_status(BlockDriverState *bs, int64_t offset,
81
int64_t bytes, int64_t *pnum, int64_t *map,
82
diff --git a/include/block/block_int-common.h b/include/block/block_int-common.h
83
index XXXXXXX..XXXXXXX 100644
84
--- a/include/block/block_int-common.h
85
+++ b/include/block/block_int-common.h
86
@@ -XXX,XX +XXX,XX @@ struct BlockDriver {
87
int coroutine_fn GRAPH_RDLOCK_PTR (*bdrv_co_load_vmstate)(
88
BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos);
89
90
+ int coroutine_fn (*bdrv_co_zone_report)(BlockDriverState *bs,
91
+ int64_t offset, unsigned int *nr_zones,
92
+ BlockZoneDescriptor *zones);
93
+ int coroutine_fn (*bdrv_co_zone_mgmt)(BlockDriverState *bs, BlockZoneOp op,
94
+ int64_t offset, int64_t len);
95
+
96
/* removable device specific */
97
bool coroutine_fn GRAPH_RDLOCK_PTR (*bdrv_co_is_inserted)(
98
BlockDriverState *bs);
99
@@ -XXX,XX +XXX,XX @@ typedef struct BlockLimits {
100
101
/* device zone model */
102
BlockZoneModel zoned;
103
+
104
+ /* zone size expressed in bytes */
105
+ uint32_t zone_size;
106
+
107
+ /* total number of zones */
108
+ uint32_t nr_zones;
109
+
110
+ /* maximum sectors of a zone append write operation */
111
+ int64_t max_append_sectors;
112
+
113
+ /* maximum number of open zones */
114
+ int64_t max_open_zones;
115
+
116
+ /* maximum number of active zones */
117
+ int64_t max_active_zones;
118
} BlockLimits;
119
120
typedef struct BdrvOpBlocker BdrvOpBlocker;
121
diff --git a/include/block/raw-aio.h b/include/block/raw-aio.h
122
index XXXXXXX..XXXXXXX 100644
123
--- a/include/block/raw-aio.h
124
+++ b/include/block/raw-aio.h
125
@@ -XXX,XX +XXX,XX @@
126
#define QEMU_AIO_WRITE_ZEROES 0x0020
127
#define QEMU_AIO_COPY_RANGE 0x0040
128
#define QEMU_AIO_TRUNCATE 0x0080
129
+#define QEMU_AIO_ZONE_REPORT 0x0100
130
+#define QEMU_AIO_ZONE_MGMT 0x0200
131
#define QEMU_AIO_TYPE_MASK \
132
(QEMU_AIO_READ | \
133
QEMU_AIO_WRITE | \
134
@@ -XXX,XX +XXX,XX @@
135
QEMU_AIO_DISCARD | \
136
QEMU_AIO_WRITE_ZEROES | \
137
QEMU_AIO_COPY_RANGE | \
138
- QEMU_AIO_TRUNCATE)
139
+ QEMU_AIO_TRUNCATE | \
140
+ QEMU_AIO_ZONE_REPORT | \
141
+ QEMU_AIO_ZONE_MGMT)
142
143
/* AIO flags */
144
#define QEMU_AIO_MISALIGNED 0x1000
145
diff --git a/include/sysemu/block-backend-io.h b/include/sysemu/block-backend-io.h
146
index XXXXXXX..XXXXXXX 100644
147
--- a/include/sysemu/block-backend-io.h
148
+++ b/include/sysemu/block-backend-io.h
149
@@ -XXX,XX +XXX,XX @@ BlockAIOCB *blk_aio_pwritev(BlockBackend *blk, int64_t offset,
150
BlockCompletionFunc *cb, void *opaque);
151
BlockAIOCB *blk_aio_flush(BlockBackend *blk,
152
BlockCompletionFunc *cb, void *opaque);
153
+BlockAIOCB *blk_aio_zone_report(BlockBackend *blk, int64_t offset,
154
+ unsigned int *nr_zones,
155
+ BlockZoneDescriptor *zones,
156
+ BlockCompletionFunc *cb, void *opaque);
157
+BlockAIOCB *blk_aio_zone_mgmt(BlockBackend *blk, BlockZoneOp op,
158
+ int64_t offset, int64_t len,
159
+ BlockCompletionFunc *cb, void *opaque);
160
BlockAIOCB *blk_aio_pdiscard(BlockBackend *blk, int64_t offset, int64_t bytes,
161
BlockCompletionFunc *cb, void *opaque);
162
void blk_aio_cancel_async(BlockAIOCB *acb);
163
@@ -XXX,XX +XXX,XX @@ int co_wrapper_mixed blk_pwrite_zeroes(BlockBackend *blk, int64_t offset,
164
int coroutine_fn blk_co_pwrite_zeroes(BlockBackend *blk, int64_t offset,
165
int64_t bytes, BdrvRequestFlags flags);
166
167
+int coroutine_fn blk_co_zone_report(BlockBackend *blk, int64_t offset,
168
+ unsigned int *nr_zones,
169
+ BlockZoneDescriptor *zones);
170
+int co_wrapper_mixed blk_zone_report(BlockBackend *blk, int64_t offset,
171
+ unsigned int *nr_zones,
172
+ BlockZoneDescriptor *zones);
173
+int coroutine_fn blk_co_zone_mgmt(BlockBackend *blk, BlockZoneOp op,
174
+ int64_t offset, int64_t len);
175
+int co_wrapper_mixed blk_zone_mgmt(BlockBackend *blk, BlockZoneOp op,
176
+ int64_t offset, int64_t len);
177
+
178
int co_wrapper_mixed blk_pdiscard(BlockBackend *blk, int64_t offset,
179
int64_t bytes);
180
int coroutine_fn blk_co_pdiscard(BlockBackend *blk, int64_t offset,
181
diff --git a/block/block-backend.c b/block/block-backend.c
182
index XXXXXXX..XXXXXXX 100644
183
--- a/block/block-backend.c
184
+++ b/block/block-backend.c
185
@@ -XXX,XX +XXX,XX @@ int coroutine_fn blk_co_flush(BlockBackend *blk)
186
return ret;
187
}
188
189
+static void coroutine_fn blk_aio_zone_report_entry(void *opaque)
190
+{
191
+ BlkAioEmAIOCB *acb = opaque;
192
+ BlkRwCo *rwco = &acb->rwco;
193
+
194
+ rwco->ret = blk_co_zone_report(rwco->blk, rwco->offset,
195
+ (unsigned int *)acb->bytes, rwco->iobuf);
196
+ blk_aio_complete(acb);
197
+}
198
+
199
+BlockAIOCB *blk_aio_zone_report(BlockBackend *blk, int64_t offset,
200
+ unsigned int *nr_zones,
201
+ BlockZoneDescriptor *zones,
202
+ BlockCompletionFunc *cb, void *opaque)
203
+{
204
+ BlkAioEmAIOCB *acb;
205
+ Coroutine *co;
206
+ IO_CODE();
207
+
208
+ blk_inc_in_flight(blk);
209
+ acb = blk_aio_get(&blk_aio_em_aiocb_info, blk, cb, opaque);
210
+ acb->rwco = (BlkRwCo) {
211
+ .blk = blk,
212
+ .offset = offset,
213
+ .iobuf = zones,
214
+ .ret = NOT_DONE,
215
+ };
216
+ acb->bytes = (int64_t)nr_zones;
217
+ acb->has_returned = false;
218
+
219
+ co = qemu_coroutine_create(blk_aio_zone_report_entry, acb);
220
+ aio_co_enter(blk_get_aio_context(blk), co);
221
+
222
+ acb->has_returned = true;
223
+ if (acb->rwco.ret != NOT_DONE) {
224
+ replay_bh_schedule_oneshot_event(blk_get_aio_context(blk),
225
+ blk_aio_complete_bh, acb);
226
+ }
227
+
228
+ return &acb->common;
229
+}
230
+
231
+static void coroutine_fn blk_aio_zone_mgmt_entry(void *opaque)
232
+{
233
+ BlkAioEmAIOCB *acb = opaque;
234
+ BlkRwCo *rwco = &acb->rwco;
235
+
236
+ rwco->ret = blk_co_zone_mgmt(rwco->blk, (BlockZoneOp)rwco->iobuf,
237
+ rwco->offset, acb->bytes);
238
+ blk_aio_complete(acb);
239
+}
240
+
241
+BlockAIOCB *blk_aio_zone_mgmt(BlockBackend *blk, BlockZoneOp op,
242
+ int64_t offset, int64_t len,
243
+ BlockCompletionFunc *cb, void *opaque) {
244
+ BlkAioEmAIOCB *acb;
245
+ Coroutine *co;
246
+ IO_CODE();
247
+
248
+ blk_inc_in_flight(blk);
249
+ acb = blk_aio_get(&blk_aio_em_aiocb_info, blk, cb, opaque);
250
+ acb->rwco = (BlkRwCo) {
251
+ .blk = blk,
252
+ .offset = offset,
253
+ .iobuf = (void *)op,
254
+ .ret = NOT_DONE,
255
+ };
256
+ acb->bytes = len;
257
+ acb->has_returned = false;
258
+
259
+ co = qemu_coroutine_create(blk_aio_zone_mgmt_entry, acb);
260
+ aio_co_enter(blk_get_aio_context(blk), co);
261
+
262
+ acb->has_returned = true;
263
+ if (acb->rwco.ret != NOT_DONE) {
264
+ replay_bh_schedule_oneshot_event(blk_get_aio_context(blk),
265
+ blk_aio_complete_bh, acb);
266
+ }
267
+
268
+ return &acb->common;
269
+}
270
+
271
+/*
272
+ * Send a zone_report command.
273
+ * offset is a byte offset from the start of the device. No alignment
274
+ * required for offset.
275
+ * nr_zones represents IN maximum and OUT actual.
276
+ */
277
+int coroutine_fn blk_co_zone_report(BlockBackend *blk, int64_t offset,
278
+ unsigned int *nr_zones,
279
+ BlockZoneDescriptor *zones)
280
+{
281
+ int ret;
282
+ IO_CODE();
283
+
284
+ blk_inc_in_flight(blk); /* increase before waiting */
285
+ blk_wait_while_drained(blk);
286
+ if (!blk_is_available(blk)) {
287
+ blk_dec_in_flight(blk);
288
+ return -ENOMEDIUM;
289
+ }
290
+ ret = bdrv_co_zone_report(blk_bs(blk), offset, nr_zones, zones);
291
+ blk_dec_in_flight(blk);
292
+ return ret;
293
+}
294
+
295
+/*
296
+ * Send a zone_management command.
297
+ * op is the zone operation;
298
+ * offset is the byte offset from the start of the zoned device;
299
+ * len is the maximum number of bytes the command should operate on. It
300
+ * should be aligned with the device zone size.
301
+ */
302
+int coroutine_fn blk_co_zone_mgmt(BlockBackend *blk, BlockZoneOp op,
303
+ int64_t offset, int64_t len)
304
+{
305
+ int ret;
306
+ IO_CODE();
307
+
308
+ blk_inc_in_flight(blk);
309
+ blk_wait_while_drained(blk);
310
+
311
+ ret = blk_check_byte_request(blk, offset, len);
312
+ if (ret < 0) {
313
+ blk_dec_in_flight(blk);
314
+ return ret;
315
+ }
316
+
317
+ ret = bdrv_co_zone_mgmt(blk_bs(blk), op, offset, len);
318
+ blk_dec_in_flight(blk);
319
+ return ret;
320
+}
321
+
322
void blk_drain(BlockBackend *blk)
323
{
324
BlockDriverState *bs = blk_bs(blk);
325
diff --git a/block/file-posix.c b/block/file-posix.c
326
index XXXXXXX..XXXXXXX 100644
327
--- a/block/file-posix.c
328
+++ b/block/file-posix.c
329
@@ -XXX,XX +XXX,XX @@
330
#include <sys/param.h>
331
#include <sys/syscall.h>
332
#include <sys/vfs.h>
333
+#if defined(CONFIG_BLKZONED)
334
+#include <linux/blkzoned.h>
335
+#endif
336
#include <linux/cdrom.h>
337
#include <linux/fd.h>
338
#include <linux/fs.h>
339
@@ -XXX,XX +XXX,XX @@ typedef struct RawPosixAIOData {
340
PreallocMode prealloc;
341
Error **errp;
342
} truncate;
343
+ struct {
344
+ unsigned int *nr_zones;
345
+ BlockZoneDescriptor *zones;
346
+ } zone_report;
347
+ struct {
348
+ unsigned long op;
349
+ } zone_mgmt;
350
};
351
} RawPosixAIOData;
352
353
@@ -XXX,XX +XXX,XX @@ static void raw_refresh_limits(BlockDriverState *bs, Error **errp)
354
zoned = BLK_Z_NONE;
355
}
356
bs->bl.zoned = zoned;
357
+ if (zoned != BLK_Z_NONE) {
358
+ /*
359
+ * The zoned device must at least have zone size and nr_zones fields.
360
+ */
361
+ ret = get_sysfs_long_val(&st, "chunk_sectors");
362
+ if (ret < 0) {
363
+ error_setg_errno(errp, -ret, "Unable to read chunk_sectors "
364
+ "sysfs attribute");
365
+ goto out;
366
+ } else if (!ret) {
367
+ error_setg(errp, "Read 0 from chunk_sectors sysfs attribute");
368
+ goto out;
369
+ }
370
+ bs->bl.zone_size = ret << BDRV_SECTOR_BITS;
371
+
372
+ ret = get_sysfs_long_val(&st, "nr_zones");
373
+ if (ret < 0) {
374
+ error_setg_errno(errp, -ret, "Unable to read nr_zones "
375
+ "sysfs attribute");
376
+ goto out;
377
+ } else if (!ret) {
378
+ error_setg(errp, "Read 0 from nr_zones sysfs attribute");
379
+ goto out;
380
+ }
381
+ bs->bl.nr_zones = ret;
382
+
383
+ ret = get_sysfs_long_val(&st, "zone_append_max_bytes");
384
+ if (ret > 0) {
385
+ bs->bl.max_append_sectors = ret >> BDRV_SECTOR_BITS;
386
+ }
387
+
388
+ ret = get_sysfs_long_val(&st, "max_open_zones");
389
+ if (ret >= 0) {
390
+ bs->bl.max_open_zones = ret;
391
+ }
392
+
393
+ ret = get_sysfs_long_val(&st, "max_active_zones");
394
+ if (ret >= 0) {
395
+ bs->bl.max_active_zones = ret;
396
+ }
397
+ return;
398
+ }
399
+out:
400
+ bs->bl.zoned = BLK_Z_NONE;
401
}
402
403
static int check_for_dasd(int fd)
404
@@ -XXX,XX +XXX,XX @@ static int hdev_probe_blocksizes(BlockDriverState *bs, BlockSizes *bsz)
405
BDRVRawState *s = bs->opaque;
406
int ret;
407
408
- /* If DASD, get blocksizes */
409
+ /* If DASD or zoned devices, get blocksizes */
410
if (check_for_dasd(s->fd) < 0) {
411
- return -ENOTSUP;
412
+ /* zoned devices are not DASD */
413
+ if (bs->bl.zoned == BLK_Z_NONE) {
414
+ return -ENOTSUP;
415
+ }
416
}
417
ret = probe_logical_blocksize(s->fd, &bsz->log);
418
if (ret < 0) {
419
@@ -XXX,XX +XXX,XX @@ static off_t copy_file_range(int in_fd, off_t *in_off, int out_fd,
420
}
421
#endif
422
423
+/*
424
+ * parse_zone - Fill a zone descriptor
425
+ */
426
+#if defined(CONFIG_BLKZONED)
427
+static inline int parse_zone(struct BlockZoneDescriptor *zone,
428
+ const struct blk_zone *blkz) {
429
+ zone->start = blkz->start << BDRV_SECTOR_BITS;
430
+ zone->length = blkz->len << BDRV_SECTOR_BITS;
431
+ zone->wp = blkz->wp << BDRV_SECTOR_BITS;
432
+
433
+#ifdef HAVE_BLK_ZONE_REP_CAPACITY
434
+ zone->cap = blkz->capacity << BDRV_SECTOR_BITS;
435
+#else
436
+ zone->cap = blkz->len << BDRV_SECTOR_BITS;
437
+#endif
438
+
439
+ switch (blkz->type) {
440
+ case BLK_ZONE_TYPE_SEQWRITE_REQ:
441
+ zone->type = BLK_ZT_SWR;
442
+ break;
443
+ case BLK_ZONE_TYPE_SEQWRITE_PREF:
444
+ zone->type = BLK_ZT_SWP;
445
+ break;
446
+ case BLK_ZONE_TYPE_CONVENTIONAL:
447
+ zone->type = BLK_ZT_CONV;
448
+ break;
449
+ default:
450
+ error_report("Unsupported zone type: 0x%x", blkz->type);
451
+ return -ENOTSUP;
452
+ }
453
+
454
+ switch (blkz->cond) {
455
+ case BLK_ZONE_COND_NOT_WP:
456
+ zone->state = BLK_ZS_NOT_WP;
457
+ break;
458
+ case BLK_ZONE_COND_EMPTY:
459
+ zone->state = BLK_ZS_EMPTY;
460
+ break;
461
+ case BLK_ZONE_COND_IMP_OPEN:
462
+ zone->state = BLK_ZS_IOPEN;
463
+ break;
464
+ case BLK_ZONE_COND_EXP_OPEN:
465
+ zone->state = BLK_ZS_EOPEN;
466
+ break;
467
+ case BLK_ZONE_COND_CLOSED:
468
+ zone->state = BLK_ZS_CLOSED;
469
+ break;
470
+ case BLK_ZONE_COND_READONLY:
471
+ zone->state = BLK_ZS_RDONLY;
472
+ break;
473
+ case BLK_ZONE_COND_FULL:
474
+ zone->state = BLK_ZS_FULL;
475
+ break;
476
+ case BLK_ZONE_COND_OFFLINE:
477
+ zone->state = BLK_ZS_OFFLINE;
478
+ break;
479
+ default:
480
+ error_report("Unsupported zone state: 0x%x", blkz->cond);
481
+ return -ENOTSUP;
482
+ }
483
+ return 0;
484
+}
485
+#endif
486
+
487
+#if defined(CONFIG_BLKZONED)
488
+static int handle_aiocb_zone_report(void *opaque)
489
+{
490
+ RawPosixAIOData *aiocb = opaque;
491
+ int fd = aiocb->aio_fildes;
492
+ unsigned int *nr_zones = aiocb->zone_report.nr_zones;
493
+ BlockZoneDescriptor *zones = aiocb->zone_report.zones;
494
+ /* zoned block devices use 512-byte sectors */
495
+ uint64_t sector = aiocb->aio_offset / 512;
496
+
497
+ struct blk_zone *blkz;
498
+ size_t rep_size;
499
+ unsigned int nrz;
500
+ int ret, n = 0, i = 0;
501
+
502
+ nrz = *nr_zones;
503
+ rep_size = sizeof(struct blk_zone_report) + nrz * sizeof(struct blk_zone);
504
+ g_autofree struct blk_zone_report *rep = NULL;
505
+ rep = g_malloc(rep_size);
506
+
507
+ blkz = (struct blk_zone *)(rep + 1);
508
+ while (n < nrz) {
509
+ memset(rep, 0, rep_size);
510
+ rep->sector = sector;
511
+ rep->nr_zones = nrz - n;
512
+
513
+ do {
514
+ ret = ioctl(fd, BLKREPORTZONE, rep);
515
+ } while (ret != 0 && errno == EINTR);
516
+ if (ret != 0) {
517
+ error_report("%d: ioctl BLKREPORTZONE at %" PRId64 " failed %d",
518
+ fd, sector, errno);
519
+ return -errno;
520
+ }
521
+
522
+ if (!rep->nr_zones) {
523
+ break;
524
+ }
525
+
526
+ for (i = 0; i < rep->nr_zones; i++, n++) {
527
+ ret = parse_zone(&zones[n], &blkz[i]);
528
+ if (ret != 0) {
529
+ return ret;
530
+ }
531
+
532
+ /* The next report should start after the last zone reported */
533
+ sector = blkz[i].start + blkz[i].len;
534
+ }
535
+ }
536
+
537
+ *nr_zones = n;
538
+ return 0;
539
+}
540
+#endif
541
+
542
+#if defined(CONFIG_BLKZONED)
543
+static int handle_aiocb_zone_mgmt(void *opaque)
544
+{
545
+ RawPosixAIOData *aiocb = opaque;
546
+ int fd = aiocb->aio_fildes;
547
+ uint64_t sector = aiocb->aio_offset / 512;
548
+ int64_t nr_sectors = aiocb->aio_nbytes / 512;
549
+ struct blk_zone_range range;
550
+ int ret;
551
+
552
+ /* Execute the operation */
553
+ range.sector = sector;
554
+ range.nr_sectors = nr_sectors;
555
+ do {
556
+ ret = ioctl(fd, aiocb->zone_mgmt.op, &range);
557
+ } while (ret != 0 && errno == EINTR);
558
+
559
+ return ret;
560
+}
561
+#endif
562
+
563
static int handle_aiocb_copy_range(void *opaque)
564
{
565
RawPosixAIOData *aiocb = opaque;
566
@@ -XXX,XX +XXX,XX @@ static void raw_account_discard(BDRVRawState *s, uint64_t nbytes, int ret)
567
}
568
}
569
570
+/*
571
+ * zone report - Get a zone block device's information in the form
572
+ * of an array of zone descriptors.
573
+ * zones is an array of zone descriptors to hold zone information on reply;
574
+ * offset can be any byte within the entire size of the device;
575
+ * nr_zones is the maximum number of zones the command should report.
576
+ */
577
+#if defined(CONFIG_BLKZONED)
578
+static int coroutine_fn raw_co_zone_report(BlockDriverState *bs, int64_t offset,
579
+ unsigned int *nr_zones,
580
+ BlockZoneDescriptor *zones) {
581
+ BDRVRawState *s = bs->opaque;
582
+ RawPosixAIOData acb = (RawPosixAIOData) {
583
+ .bs = bs,
584
+ .aio_fildes = s->fd,
585
+ .aio_type = QEMU_AIO_ZONE_REPORT,
586
+ .aio_offset = offset,
587
+ .zone_report = {
588
+ .nr_zones = nr_zones,
589
+ .zones = zones,
590
+ },
591
+ };
592
+
593
+ return raw_thread_pool_submit(bs, handle_aiocb_zone_report, &acb);
594
+}
595
+#endif
596
+
597
+/*
598
+ * zone management operations - Execute an operation on a zone
599
+ */
600
+#if defined(CONFIG_BLKZONED)
601
+static int coroutine_fn raw_co_zone_mgmt(BlockDriverState *bs, BlockZoneOp op,
602
+ int64_t offset, int64_t len) {
603
+ BDRVRawState *s = bs->opaque;
604
+ RawPosixAIOData acb;
605
+ int64_t zone_size, zone_size_mask;
606
+ const char *op_name;
607
+ unsigned long zo;
608
+ int ret;
609
+ int64_t capacity = bs->total_sectors << BDRV_SECTOR_BITS;
610
+
611
+ zone_size = bs->bl.zone_size;
612
+ zone_size_mask = zone_size - 1;
613
+ if (offset & zone_size_mask) {
614
+ error_report("sector offset %" PRId64 " is not aligned to zone size "
615
+ "%" PRId64 "", offset / 512, zone_size / 512);
616
+ return -EINVAL;
617
+ }
618
+
619
+ if (((offset + len) < capacity && len & zone_size_mask) ||
620
+ offset + len > capacity) {
621
+ error_report("number of sectors %" PRId64 " is not aligned to zone size"
622
+ " %" PRId64 "", len / 512, zone_size / 512);
623
+ return -EINVAL;
624
+ }
625
+
626
+ switch (op) {
627
+ case BLK_ZO_OPEN:
628
+ op_name = "BLKOPENZONE";
629
+ zo = BLKOPENZONE;
630
+ break;
631
+ case BLK_ZO_CLOSE:
632
+ op_name = "BLKCLOSEZONE";
633
+ zo = BLKCLOSEZONE;
634
+ break;
635
+ case BLK_ZO_FINISH:
636
+ op_name = "BLKFINISHZONE";
637
+ zo = BLKFINISHZONE;
638
+ break;
639
+ case BLK_ZO_RESET:
640
+ op_name = "BLKRESETZONE";
641
+ zo = BLKRESETZONE;
642
+ break;
643
+ default:
644
+ error_report("Unsupported zone op: 0x%x", op);
645
+ return -ENOTSUP;
646
+ }
647
+
648
+ acb = (RawPosixAIOData) {
649
+ .bs = bs,
650
+ .aio_fildes = s->fd,
651
+ .aio_type = QEMU_AIO_ZONE_MGMT,
652
+ .aio_offset = offset,
653
+ .aio_nbytes = len,
654
+ .zone_mgmt = {
655
+ .op = zo,
656
+ },
657
+ };
658
+
659
+ ret = raw_thread_pool_submit(bs, handle_aiocb_zone_mgmt, &acb);
660
+ if (ret != 0) {
661
+ error_report("ioctl %s failed %d", op_name, ret);
662
+ }
663
+
664
+ return ret;
665
+}
666
+#endif
667
+
668
static coroutine_fn int
669
raw_do_pdiscard(BlockDriverState *bs, int64_t offset, int64_t bytes,
670
bool blkdev)
671
@@ -XXX,XX +XXX,XX @@ static BlockDriver bdrv_host_device = {
672
#ifdef __linux__
673
.bdrv_co_ioctl = hdev_co_ioctl,
674
#endif
675
+
676
+ /* zoned device */
677
+#if defined(CONFIG_BLKZONED)
678
+ /* zone management operations */
679
+ .bdrv_co_zone_report = raw_co_zone_report,
680
+ .bdrv_co_zone_mgmt = raw_co_zone_mgmt,
681
+#endif
682
};
683
684
#if defined(__linux__) || defined(__FreeBSD__) || defined(__FreeBSD_kernel__)
685
diff --git a/block/io.c b/block/io.c
686
index XXXXXXX..XXXXXXX 100644
687
--- a/block/io.c
688
+++ b/block/io.c
689
@@ -XXX,XX +XXX,XX @@ out:
690
return co.ret;
691
}
692
693
+int coroutine_fn bdrv_co_zone_report(BlockDriverState *bs, int64_t offset,
694
+ unsigned int *nr_zones,
695
+ BlockZoneDescriptor *zones)
696
+{
697
+ BlockDriver *drv = bs->drv;
698
+ CoroutineIOCompletion co = {
699
+ .coroutine = qemu_coroutine_self(),
700
+ };
701
+ IO_CODE();
702
+
703
+ bdrv_inc_in_flight(bs);
704
+ if (!drv || !drv->bdrv_co_zone_report || bs->bl.zoned == BLK_Z_NONE) {
705
+ co.ret = -ENOTSUP;
706
+ goto out;
707
+ }
708
+ co.ret = drv->bdrv_co_zone_report(bs, offset, nr_zones, zones);
709
+out:
710
+ bdrv_dec_in_flight(bs);
711
+ return co.ret;
712
+}
713
+
714
+int coroutine_fn bdrv_co_zone_mgmt(BlockDriverState *bs, BlockZoneOp op,
715
+ int64_t offset, int64_t len)
716
+{
717
+ BlockDriver *drv = bs->drv;
718
+ CoroutineIOCompletion co = {
719
+ .coroutine = qemu_coroutine_self(),
720
+ };
721
+ IO_CODE();
722
+
723
+ bdrv_inc_in_flight(bs);
724
+ if (!drv || !drv->bdrv_co_zone_mgmt || bs->bl.zoned == BLK_Z_NONE) {
725
+ co.ret = -ENOTSUP;
726
+ goto out;
727
+ }
728
+ co.ret = drv->bdrv_co_zone_mgmt(bs, op, offset, len);
729
+out:
730
+ bdrv_dec_in_flight(bs);
731
+ return co.ret;
732
+}
733
+
734
void *qemu_blockalign(BlockDriverState *bs, size_t size)
735
{
736
IO_CODE();
737
diff --git a/qemu-io-cmds.c b/qemu-io-cmds.c
738
index XXXXXXX..XXXXXXX 100644
739
--- a/qemu-io-cmds.c
740
+++ b/qemu-io-cmds.c
741
@@ -XXX,XX +XXX,XX @@ static const cmdinfo_t flush_cmd = {
742
.oneline = "flush all in-core file state to disk",
743
};
744
745
+static inline int64_t tosector(int64_t bytes)
746
+{
747
+ return bytes >> BDRV_SECTOR_BITS;
748
+}
749
+
750
+static int zone_report_f(BlockBackend *blk, int argc, char **argv)
751
+{
752
+ int ret;
753
+ int64_t offset;
754
+ unsigned int nr_zones;
755
+
756
+ ++optind;
757
+ offset = cvtnum(argv[optind]);
758
+ ++optind;
759
+ nr_zones = cvtnum(argv[optind]);
760
+
761
+ g_autofree BlockZoneDescriptor *zones = NULL;
762
+ zones = g_new(BlockZoneDescriptor, nr_zones);
763
+ ret = blk_zone_report(blk, offset, &nr_zones, zones);
764
+ if (ret < 0) {
765
+ printf("zone report failed: %s\n", strerror(-ret));
766
+ } else {
767
+ for (int i = 0; i < nr_zones; ++i) {
768
+ printf("start: 0x%" PRIx64 ", len 0x%" PRIx64 ", "
769
+ "cap"" 0x%" PRIx64 ", wptr 0x%" PRIx64 ", "
770
+ "zcond:%u, [type: %u]\n",
771
+ tosector(zones[i].start), tosector(zones[i].length),
772
+ tosector(zones[i].cap), tosector(zones[i].wp),
773
+ zones[i].state, zones[i].type);
774
+ }
775
+ }
776
+ return ret;
777
+}
778
+
779
+static const cmdinfo_t zone_report_cmd = {
780
+ .name = "zone_report",
781
+ .altname = "zrp",
782
+ .cfunc = zone_report_f,
783
+ .argmin = 2,
784
+ .argmax = 2,
785
+ .args = "offset number",
786
+ .oneline = "report zone information",
787
+};
788
+
789
+static int zone_open_f(BlockBackend *blk, int argc, char **argv)
790
+{
791
+ int ret;
792
+ int64_t offset, len;
793
+ ++optind;
794
+ offset = cvtnum(argv[optind]);
795
+ ++optind;
796
+ len = cvtnum(argv[optind]);
797
+ ret = blk_zone_mgmt(blk, BLK_ZO_OPEN, offset, len);
798
+ if (ret < 0) {
799
+ printf("zone open failed: %s\n", strerror(-ret));
800
+ }
801
+ return ret;
802
+}
803
+
804
+static const cmdinfo_t zone_open_cmd = {
805
+ .name = "zone_open",
806
+ .altname = "zo",
807
+ .cfunc = zone_open_f,
808
+ .argmin = 2,
809
+ .argmax = 2,
810
+ .args = "offset len",
811
+ .oneline = "explicitly open a range of zones in a zoned block device",
812
+};
813
+
814
+static int zone_close_f(BlockBackend *blk, int argc, char **argv)
815
+{
816
+ int ret;
817
+ int64_t offset, len;
818
+ ++optind;
819
+ offset = cvtnum(argv[optind]);
820
+ ++optind;
821
+ len = cvtnum(argv[optind]);
822
+ ret = blk_zone_mgmt(blk, BLK_ZO_CLOSE, offset, len);
823
+ if (ret < 0) {
824
+ printf("zone close failed: %s\n", strerror(-ret));
825
+ }
826
+ return ret;
827
+}
828
+
829
+static const cmdinfo_t zone_close_cmd = {
830
+ .name = "zone_close",
831
+ .altname = "zc",
832
+ .cfunc = zone_close_f,
833
+ .argmin = 2,
834
+ .argmax = 2,
835
+ .args = "offset len",
836
+ .oneline = "close a range of zones in zone block device",
837
+};
838
+
839
+static int zone_finish_f(BlockBackend *blk, int argc, char **argv)
840
+{
841
+ int ret;
842
+ int64_t offset, len;
843
+ ++optind;
844
+ offset = cvtnum(argv[optind]);
845
+ ++optind;
846
+ len = cvtnum(argv[optind]);
847
+ ret = blk_zone_mgmt(blk, BLK_ZO_FINISH, offset, len);
848
+ if (ret < 0) {
849
+ printf("zone finish failed: %s\n", strerror(-ret));
850
+ }
851
+ return ret;
852
+}
853
+
854
+static const cmdinfo_t zone_finish_cmd = {
855
+ .name = "zone_finish",
856
+ .altname = "zf",
857
+ .cfunc = zone_finish_f,
858
+ .argmin = 2,
859
+ .argmax = 2,
860
+ .args = "offset len",
861
+ .oneline = "finish a range of zones in zone block device",
862
+};
863
+
864
+static int zone_reset_f(BlockBackend *blk, int argc, char **argv)
865
+{
866
+ int ret;
867
+ int64_t offset, len;
868
+ ++optind;
869
+ offset = cvtnum(argv[optind]);
870
+ ++optind;
871
+ len = cvtnum(argv[optind]);
872
+ ret = blk_zone_mgmt(blk, BLK_ZO_RESET, offset, len);
873
+ if (ret < 0) {
874
+ printf("zone reset failed: %s\n", strerror(-ret));
875
+ }
876
+ return ret;
877
+}
878
+
879
+static const cmdinfo_t zone_reset_cmd = {
880
+ .name = "zone_reset",
881
+ .altname = "zrs",
882
+ .cfunc = zone_reset_f,
883
+ .argmin = 2,
884
+ .argmax = 2,
885
+ .args = "offset len",
886
+ .oneline = "reset a zone write pointer in zone block device",
887
+};
888
+
889
static int truncate_f(BlockBackend *blk, int argc, char **argv);
890
static const cmdinfo_t truncate_cmd = {
891
.name = "truncate",
892
@@ -XXX,XX +XXX,XX @@ static void __attribute((constructor)) init_qemuio_commands(void)
893
qemuio_add_command(&aio_write_cmd);
894
qemuio_add_command(&aio_flush_cmd);
895
qemuio_add_command(&flush_cmd);
896
+ qemuio_add_command(&zone_report_cmd);
897
+ qemuio_add_command(&zone_open_cmd);
898
+ qemuio_add_command(&zone_close_cmd);
899
+ qemuio_add_command(&zone_finish_cmd);
900
+ qemuio_add_command(&zone_reset_cmd);
901
qemuio_add_command(&truncate_cmd);
902
qemuio_add_command(&length_cmd);
903
qemuio_add_command(&info_cmd);
904
--
905
2.39.2
906
907
New patch
1
From: Sam Li <faithilikerun@gmail.com>
1
2
3
raw-format driver usually sits on top of file-posix driver. It needs to
4
pass through requests of zone commands.
5
6
Signed-off-by: Sam Li <faithilikerun@gmail.com>
7
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
8
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
9
Reviewed-by: Hannes Reinecke <hare@suse.de>
10
Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
11
Acked-by: Kevin Wolf <kwolf@redhat.com>
12
Message-id: 20230324090605.28361-5-faithilikerun@gmail.com
13
[Adjust commit message prefix as suggested by Philippe Mathieu-Daudé
14
<philmd@linaro.org>.
15
--Stefan]
16
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
17
---
18
block/raw-format.c | 17 +++++++++++++++++
19
1 file changed, 17 insertions(+)
20
21
diff --git a/block/raw-format.c b/block/raw-format.c
22
index XXXXXXX..XXXXXXX 100644
23
--- a/block/raw-format.c
24
+++ b/block/raw-format.c
25
@@ -XXX,XX +XXX,XX @@ raw_co_pdiscard(BlockDriverState *bs, int64_t offset, int64_t bytes)
26
return bdrv_co_pdiscard(bs->file, offset, bytes);
27
}
28
29
+static int coroutine_fn GRAPH_RDLOCK
30
+raw_co_zone_report(BlockDriverState *bs, int64_t offset,
31
+ unsigned int *nr_zones,
32
+ BlockZoneDescriptor *zones)
33
+{
34
+ return bdrv_co_zone_report(bs->file->bs, offset, nr_zones, zones);
35
+}
36
+
37
+static int coroutine_fn GRAPH_RDLOCK
38
+raw_co_zone_mgmt(BlockDriverState *bs, BlockZoneOp op,
39
+ int64_t offset, int64_t len)
40
+{
41
+ return bdrv_co_zone_mgmt(bs->file->bs, op, offset, len);
42
+}
43
+
44
static int64_t coroutine_fn GRAPH_RDLOCK
45
raw_co_getlength(BlockDriverState *bs)
46
{
47
@@ -XXX,XX +XXX,XX @@ BlockDriver bdrv_raw = {
48
.bdrv_co_pwritev = &raw_co_pwritev,
49
.bdrv_co_pwrite_zeroes = &raw_co_pwrite_zeroes,
50
.bdrv_co_pdiscard = &raw_co_pdiscard,
51
+ .bdrv_co_zone_report = &raw_co_zone_report,
52
+ .bdrv_co_zone_mgmt = &raw_co_zone_mgmt,
53
.bdrv_co_block_status = &raw_co_block_status,
54
.bdrv_co_copy_range_from = &raw_co_copy_range_from,
55
.bdrv_co_copy_range_to = &raw_co_copy_range_to,
56
--
57
2.39.2
58
59
New patch
1
From: Sam Li <faithilikerun@gmail.com>
1
2
3
Putting zoned/non-zoned BlockDrivers on top of each other is not
4
allowed.
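
The rule can be sketched in isolation as follows. This is a minimal
illustration, not the code added by this patch: ZonedModel, ParentDriver
and can_attach_child() are hypothetical stand-ins for the BLK_Z_* values,
the BlockDriver flag and the check in bdrv_add_child(). As in the patch,
only a host-managed child is refused; host-aware and non-zoned children
are accepted by any parent.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the zoned models. */
typedef enum {
    MODEL_NONE,          /* regular block device */
    MODEL_HOST_AWARE,    /* zoned, but random writes are accepted */
    MODEL_HOST_MANAGED,  /* zoned, sequential writes only */
} ZonedModel;

/* Hypothetical stand-in for the flag a parent driver sets. */
typedef struct {
    bool supports_zoned_children;
} ParentDriver;

/* Return true if a child with the given model may be attached to parent. */
static bool can_attach_child(const ParentDriver *parent, ZonedModel child)
{
    assert(parent != NULL);
    /* Only host-managed children need a parent that understands zones. */
    if (child == MODEL_HOST_MANAGED && !parent->supports_zoned_children) {
        return false;
    }
    return true;
}
```

With this sketch, a parent that sets supports_zoned_children (raw-format
in this series) accepts every child, while any other parent rejects only
host-managed children.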
5
6
Signed-off-by: Sam Li <faithilikerun@gmail.com>
7
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
8
Reviewed-by: Hannes Reinecke <hare@suse.de>
9
Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
10
Acked-by: Kevin Wolf <kwolf@redhat.com>
11
Message-id: 20230324090605.28361-6-faithilikerun@gmail.com
12
[Adjust commit message prefix as suggested by Philippe Mathieu-Daudé
13
<philmd@linaro.org> and clarify that the check is about zoned
14
BlockDrivers.
15
--Stefan]
16
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
17
---
18
include/block/block_int-common.h | 5 +++++
19
block.c | 19 +++++++++++++++++++
20
block/file-posix.c | 12 ++++++++++++
21
block/raw-format.c | 1 +
22
4 files changed, 37 insertions(+)
23
24
diff --git a/include/block/block_int-common.h b/include/block/block_int-common.h
25
index XXXXXXX..XXXXXXX 100644
26
--- a/include/block/block_int-common.h
27
+++ b/include/block/block_int-common.h
28
@@ -XXX,XX +XXX,XX @@ struct BlockDriver {
29
*/
30
bool is_format;
31
32
+ /*
33
+ * Set to true if the BlockDriver supports zoned children.
34
+ */
35
+ bool supports_zoned_children;
36
+
37
/*
38
* Drivers not implementing bdrv_parse_filename nor bdrv_open should have
39
* this field set to true, except ones that are defined only by their
40
diff --git a/block.c b/block.c
41
index XXXXXXX..XXXXXXX 100644
42
--- a/block.c
43
+++ b/block.c
44
@@ -XXX,XX +XXX,XX @@ void bdrv_add_child(BlockDriverState *parent_bs, BlockDriverState *child_bs,
45
return;
46
}
47
48
+ /*
49
+ * Non-zoned block drivers do not follow zoned storage constraints
50
+ * (i.e. sequential writes to zones). Refuse mixing zoned and non-zoned
51
+ * drivers in a graph.
52
+ */
53
+ if (!parent_bs->drv->supports_zoned_children &&
54
+ child_bs->bl.zoned == BLK_Z_HM) {
55
+ /*
56
+ * The host-aware model allows zoned storage constraints and random
57
+ * write. Allow mixing host-aware and non-zoned drivers. Using
58
+ * host-aware device as a regular device.
59
+ */
60
+ error_setg(errp, "Cannot add a %s child to a %s parent",
61
+ child_bs->bl.zoned == BLK_Z_HM ? "zoned" : "non-zoned",
62
+ parent_bs->drv->supports_zoned_children ?
63
+ "support zoned children" : "not support zoned children");
64
+ return;
65
+ }
66
+
67
if (!QLIST_EMPTY(&child_bs->parents)) {
68
error_setg(errp, "The node %s already has a parent",
69
child_bs->node_name);
70
diff --git a/block/file-posix.c b/block/file-posix.c
71
index XXXXXXX..XXXXXXX 100644
72
--- a/block/file-posix.c
73
+++ b/block/file-posix.c
74
@@ -XXX,XX +XXX,XX @@ static int raw_open_common(BlockDriverState *bs, QDict *options,
75
goto fail;
76
}
77
}
78
+#ifdef CONFIG_BLKZONED
79
+ /*
80
+ * The kernel page cache does not reliably work for writes to SWR zones
81
+ * of a zoned block device because it cannot guarantee the order of writes.
82
+ */
83
+ if ((bs->bl.zoned != BLK_Z_NONE) &&
84
+ (!(s->open_flags & O_DIRECT))) {
85
+ error_setg(errp, "The driver supports zoned devices, and it requires "
86
+ "cache.direct=on, which was not specified.");
87
+ return -EINVAL; /* No host kernel page cache */
88
+ }
89
+#endif
90
91
if (S_ISBLK(st.st_mode)) {
92
#ifdef __linux__
93
diff --git a/block/raw-format.c b/block/raw-format.c
94
index XXXXXXX..XXXXXXX 100644
95
--- a/block/raw-format.c
96
+++ b/block/raw-format.c
97
@@ -XXX,XX +XXX,XX @@ static void raw_child_perm(BlockDriverState *bs, BdrvChild *c,
98
BlockDriver bdrv_raw = {
99
.format_name = "raw",
100
.instance_size = sizeof(BDRVRawState),
101
+ .supports_zoned_children = true,
102
.bdrv_probe = &raw_probe,
103
.bdrv_reopen_prepare = &raw_reopen_prepare,
104
.bdrv_reopen_commit = &raw_reopen_commit,
105
--
106
2.39.2
107
108
New patch
1
From: Sam Li <faithilikerun@gmail.com>
1
2
3
The new block layer APIs of zoned block devices can be tested by:
4
$ tests/qemu-iotests/check zoned
5
Run each zone operation on a newly created null_blk device
6
and see whether it outputs the same zone information.
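
The byte offsets passed to the qemu-io commands in the test and the
sector values printed in the expected output are related by a shift of
9 bits (512-byte sectors). The sketch below only illustrates that
arithmetic and the power-of-two alignment test used for zone management
requests; SECTOR_BITS, bytes_to_sectors() and is_zone_aligned() are
hypothetical names, not QEMU APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SECTOR_BITS 9 /* 512-byte sectors, assumed equal to BDRV_SECTOR_BITS */

/* Convert a byte offset or length to 512-byte sectors. */
static int64_t bytes_to_sectors(int64_t bytes)
{
    return bytes >> SECTOR_BITS;
}

/* Check that offset is a multiple of zone_size (a power of two, in bytes). */
static bool is_zone_aligned(int64_t offset, int64_t zone_size)
{
    assert(zone_size > 0 && (zone_size & (zone_size - 1)) == 0);
    return (offset & (zone_size - 1)) == 0;
}
```

For instance, the zone size 268435456 used by the test maps to 0x80000
sectors, and the offset 0x3e70000000 of the last zone maps to sector
0x1f380000, which matches the "start" and "len" fields in zoned.out.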
7
8
Signed-off-by: Sam Li <faithilikerun@gmail.com>
9
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
10
Acked-by: Kevin Wolf <kwolf@redhat.com>
11
Message-id: 20230324090605.28361-7-faithilikerun@gmail.com
12
[Adjust commit message prefix as suggested by Philippe Mathieu-Daudé
13
<philmd@linaro.org>.
14
--Stefan]
15
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
16
---
17
tests/qemu-iotests/tests/zoned | 89 ++++++++++++++++++++++++++++++
18
tests/qemu-iotests/tests/zoned.out | 53 ++++++++++++++++++
19
2 files changed, 142 insertions(+)
20
create mode 100755 tests/qemu-iotests/tests/zoned
21
create mode 100644 tests/qemu-iotests/tests/zoned.out
22
23
diff --git a/tests/qemu-iotests/tests/zoned b/tests/qemu-iotests/tests/zoned
24
new file mode 100755
25
index XXXXXXX..XXXXXXX
26
--- /dev/null
27
+++ b/tests/qemu-iotests/tests/zoned
28
@@ -XXX,XX +XXX,XX @@
29
+#!/usr/bin/env bash
30
+#
31
+# Test zone management operations.
32
+#
33
+
34
+seq="$(basename $0)"
35
+echo "QA output created by $seq"
36
+status=1 # failure is the default!
37
+
38
+_cleanup()
39
+{
40
+ _cleanup_test_img
41
+ sudo -n rmmod null_blk
42
+}
43
+trap "_cleanup; exit \$status" 0 1 2 3 15
44
+
45
+# get standard environment, filters and checks
46
+. ../common.rc
47
+. ../common.filter
48
+. ../common.qemu
49
+
50
+# This test only runs on Linux hosts with raw image files.
51
+_supported_fmt raw
52
+_supported_proto file
53
+_supported_os Linux
54
+
55
+sudo -n true || \
56
+ _notrun 'Password-less sudo required'
57
+
58
+IMG="--image-opts -n driver=host_device,filename=/dev/nullb0"
59
+QEMU_IO_OPTIONS=$QEMU_IO_OPTIONS_NO_FMT
60
+
61
+echo "Testing a null_blk device:"
62
+echo "case 1: if the operations work"
63
+sudo -n modprobe null_blk nr_devices=1 zoned=1
64
+sudo -n chmod 0666 /dev/nullb0
65
+
66
+echo "(1) report the first zone:"
67
+$QEMU_IO $IMG -c "zrp 0 1"
68
+echo
69
+echo "report the first 10 zones"
70
+$QEMU_IO $IMG -c "zrp 0 10"
71
+echo
72
+echo "report the last zone:"
73
+$QEMU_IO $IMG -c "zrp 0x3e70000000 2" # 0x3e70000000 / 512 = 0x1f380000
74
+echo
75
+echo
76
+echo "(2) opening the first zone"
77
+$QEMU_IO $IMG -c "zo 0 268435456" # 268435456 / 512 = 524288
78
+echo "report after:"
79
+$QEMU_IO $IMG -c "zrp 0 1"
80
+echo
81
+echo "opening the second zone"
82
+$QEMU_IO $IMG -c "zo 268435456 268435456" #
83
+echo "report after:"
84
+$QEMU_IO $IMG -c "zrp 268435456 1"
85
+echo
86
+echo "opening the last zone"
87
+$QEMU_IO $IMG -c "zo 0x3e70000000 268435456"
88
+echo "report after:"
89
+$QEMU_IO $IMG -c "zrp 0x3e70000000 2"
90
+echo
91
+echo
92
+echo "(3) closing the first zone"
93
+$QEMU_IO $IMG -c "zc 0 268435456"
94
+echo "report after:"
95
+$QEMU_IO $IMG -c "zrp 0 1"
96
+echo
97
+echo "closing the last zone"
98
+$QEMU_IO $IMG -c "zc 0x3e70000000 268435456"
99
+echo "report after:"
100
+$QEMU_IO $IMG -c "zrp 0x3e70000000 2"
101
+echo
102
+echo
103
+echo "(4) finishing the second zone"
104
+$QEMU_IO $IMG -c "zf 268435456 268435456"
105
+echo "After finishing a zone:"
106
+$QEMU_IO $IMG -c "zrp 268435456 1"
107
+echo
108
+echo
109
+echo "(5) resetting the second zone"
110
+$QEMU_IO $IMG -c "zrs 268435456 268435456"
111
+echo "After resetting a zone:"
112
+$QEMU_IO $IMG -c "zrp 268435456 1"
113
+
114
+# success, all done
115
+echo "*** done"
116
+rm -f $seq.full
117
+status=0
118
diff --git a/tests/qemu-iotests/tests/zoned.out b/tests/qemu-iotests/tests/zoned.out
119
new file mode 100644
120
index XXXXXXX..XXXXXXX
121
--- /dev/null
122
+++ b/tests/qemu-iotests/tests/zoned.out
123
@@ -XXX,XX +XXX,XX @@
124
+QA output created by zoned
125
+Testing a null_blk device:
126
+case 1: if the operations work
127
+(1) report the first zone:
128
+start: 0x0, len 0x80000, cap 0x80000, wptr 0x0, zcond:1, [type: 2]
129
+
130
+report the first 10 zones
131
+start: 0x0, len 0x80000, cap 0x80000, wptr 0x0, zcond:1, [type: 2]
132
+start: 0x80000, len 0x80000, cap 0x80000, wptr 0x80000, zcond:1, [type: 2]
133
+start: 0x100000, len 0x80000, cap 0x80000, wptr 0x100000, zcond:1, [type: 2]
134
+start: 0x180000, len 0x80000, cap 0x80000, wptr 0x180000, zcond:1, [type: 2]
135
+start: 0x200000, len 0x80000, cap 0x80000, wptr 0x200000, zcond:1, [type: 2]
136
+start: 0x280000, len 0x80000, cap 0x80000, wptr 0x280000, zcond:1, [type: 2]
137
+start: 0x300000, len 0x80000, cap 0x80000, wptr 0x300000, zcond:1, [type: 2]
138
+start: 0x380000, len 0x80000, cap 0x80000, wptr 0x380000, zcond:1, [type: 2]
139
+start: 0x400000, len 0x80000, cap 0x80000, wptr 0x400000, zcond:1, [type: 2]
140
+start: 0x480000, len 0x80000, cap 0x80000, wptr 0x480000, zcond:1, [type: 2]
141
+
142
+report the last zone:
143
+start: 0x1f380000, len 0x80000, cap 0x80000, wptr 0x1f380000, zcond:1, [type: 2]
144
+
145
+
146
+(2) opening the first zone
147
+report after:
148
+start: 0x0, len 0x80000, cap 0x80000, wptr 0x0, zcond:3, [type: 2]
149
+
150
+opening the second zone
151
+report after:
152
+start: 0x80000, len 0x80000, cap 0x80000, wptr 0x80000, zcond:3, [type: 2]
153
+
154
+opening the last zone
155
+report after:
156
+start: 0x1f380000, len 0x80000, cap 0x80000, wptr 0x1f380000, zcond:3, [type: 2]
157
+
158
+
159
+(3) closing the first zone
160
+report after:
161
+start: 0x0, len 0x80000, cap 0x80000, wptr 0x0, zcond:1, [type: 2]
162
+
163
+closing the last zone
164
+report after:
165
+start: 0x1f380000, len 0x80000, cap 0x80000, wptr 0x1f380000, zcond:1, [type: 2]
166
+
167
+
168
+(4) finishing the second zone
169
+After finishing a zone:
170
+start: 0x80000, len 0x80000, cap 0x80000, wptr 0x100000, zcond:14, [type: 2]
171
+
172
+
173
+(5) resetting the second zone
174
+After resetting a zone:
175
+start: 0x80000, len 0x80000, cap 0x80000, wptr 0x80000, zcond:1, [type: 2]
176
+*** done
177
--
178
2.39.2
179
180
New patch
1
From: Sam Li <faithilikerun@gmail.com>
1
2
3
Signed-off-by: Sam Li <faithilikerun@gmail.com>
4
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
5
Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
6
Acked-by: Kevin Wolf <kwolf@redhat.com>
7
Message-id: 20230324090605.28361-8-faithilikerun@gmail.com
8
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
9
---
10
block/file-posix.c | 3 +++
11
block/trace-events | 2 ++
12
2 files changed, 5 insertions(+)
13
14
diff --git a/block/file-posix.c b/block/file-posix.c
15
index XXXXXXX..XXXXXXX 100644
16
--- a/block/file-posix.c
17
+++ b/block/file-posix.c
18
@@ -XXX,XX +XXX,XX @@ static int coroutine_fn raw_co_zone_report(BlockDriverState *bs, int64_t offset,
19
},
20
};
21
22
+ trace_zbd_zone_report(bs, *nr_zones, offset >> BDRV_SECTOR_BITS);
23
return raw_thread_pool_submit(bs, handle_aiocb_zone_report, &acb);
24
}
25
#endif
26
@@ -XXX,XX +XXX,XX @@ static int coroutine_fn raw_co_zone_mgmt(BlockDriverState *bs, BlockZoneOp op,
27
},
28
};
29
30
+ trace_zbd_zone_mgmt(bs, op_name, offset >> BDRV_SECTOR_BITS,
31
+ len >> BDRV_SECTOR_BITS);
32
ret = raw_thread_pool_submit(bs, handle_aiocb_zone_mgmt, &acb);
33
if (ret != 0) {
34
error_report("ioctl %s failed %d", op_name, ret);
35
diff --git a/block/trace-events b/block/trace-events
36
index XXXXXXX..XXXXXXX 100644
37
--- a/block/trace-events
38
+++ b/block/trace-events
39
@@ -XXX,XX +XXX,XX @@ file_FindEjectableOpticalMedia(const char *media) "Matching using %s"
40
file_setup_cdrom(const char *partition) "Using %s as optical disc"
41
file_hdev_is_sg(int type, int version) "SG device found: type=%d, version=%d"
42
file_flush_fdatasync_failed(int err) "errno %d"
43
+zbd_zone_report(void *bs, unsigned int nr_zones, int64_t sector) "bs %p report %d zones starting at sector offset 0x%" PRIx64 ""
44
+zbd_zone_mgmt(void *bs, const char *op_name, int64_t sector, int64_t len) "bs %p %s starts at sector offset 0x%" PRIx64 " over a range of 0x%" PRIx64 " sectors"
45
46
# ssh.c
47
sftp_error(const char *op, const char *ssh_err, int ssh_err_code, int sftp_err_code) "%s failed: %s (libssh error code: %d, sftp error code: %d)"
48
--
49
2.39.2
New patch
1
From: Sam Li <faithilikerun@gmail.com>
1
2
3
Add the documentation about the zoned device support to virtio-blk
4
emulation.
5
6
Signed-off-by: Sam Li <faithilikerun@gmail.com>
7
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
8
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
9
Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
10
Acked-by: Kevin Wolf <kwolf@redhat.com>
11
Message-id: 20230324090605.28361-9-faithilikerun@gmail.com
12
[Add index-api.rst to fix "zoned-storage.rst:document isn't included in
13
any toctree" error.
14
--Stefan]
15
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
16
---
17
docs/devel/index-api.rst | 1 +
18
docs/devel/zoned-storage.rst | 43 ++++++++++++++++++++++++++
19
docs/system/qemu-block-drivers.rst.inc | 6 ++++
20
3 files changed, 50 insertions(+)
21
create mode 100644 docs/devel/zoned-storage.rst
22
23
diff --git a/docs/devel/index-api.rst b/docs/devel/index-api.rst
24
index XXXXXXX..XXXXXXX 100644
25
--- a/docs/devel/index-api.rst
26
+++ b/docs/devel/index-api.rst
27
@@ -XXX,XX +XXX,XX @@ generated from in-code annotations to function prototypes.
28
memory
29
modules
30
ui
31
+ zoned-storage
32
diff --git a/docs/devel/zoned-storage.rst b/docs/devel/zoned-storage.rst
33
new file mode 100644
34
index XXXXXXX..XXXXXXX
35
--- /dev/null
36
+++ b/docs/devel/zoned-storage.rst
37
@@ -XXX,XX +XXX,XX @@
38
+=============
39
+zoned-storage
40
+=============
41
+
42
+Zoned Block Devices (ZBDs) divide the LBA space into block regions called zones
43
+that are larger than the LBA size. They can only allow sequential writes, which
44
+can reduce write amplification in SSDs, and potentially lead to higher
45
+throughput and increased capacity. More details about ZBDs can be found at:
46
+
47
+https://zonedstorage.io/docs/introduction/zoned-storage
48
+
49
+1. Block layer APIs for zoned storage
50
+-------------------------------------
51
+QEMU block layer supports three zoned storage models:
52
+- BLK_Z_HM: The host-managed zoned model only allows sequential write access
53
+to zones. It supports ZBD-specific I/O commands that can be used by a host to
54
+manage the zones of a device.
55
+- BLK_Z_HA: The host-aware zoned model allows random write operations in
56
+zones, making it backward compatible with regular block devices.
57
+- BLK_Z_NONE: The non-zoned model has no zones support. It includes both
58
+regular and drive-managed ZBD devices. ZBD-specific I/O commands are not
59
+supported.
60
+
61
+The block device information resides inside BlockDriverState. QEMU uses
62
+BlockLimits struct (BlockDriverState::bl) that is continuously accessed by the
63
+block layer while processing I/O requests. A BlockBackend has a root pointer to
64
+a BlockDriverState graph (for example, raw format on top of file-posix). The
65
+zoned storage information can be propagated from the leaf BlockDriverState all
66
+the way up to the BlockBackend. If the zoned storage model in file-posix is
67
+set to BLK_Z_HM, then block drivers will declare support for zoned host device.
68
+
69
+The block layer APIs support commands needed for zoned storage devices,
70
+including report zones, four zone operations, and zone append.
71
+
72
+2. Emulating zoned storage controllers
73
+--------------------------------------
74
+When the BlockBackend's BlockLimits model reports a zoned storage device, users
75
+like the virtio-blk emulation or the qemu-io-cmds.c utility can use block layer
76
+APIs for zoned storage emulation or testing.
77
+
78
+For example, to test zone_report on a null_blk device using qemu-io, run:
79
+$ path/to/qemu-io --image-opts -n driver=host_device,filename=/dev/nullb0
80
+-c "zrp offset nr_zones"
81
diff --git a/docs/system/qemu-block-drivers.rst.inc b/docs/system/qemu-block-drivers.rst.inc
82
index XXXXXXX..XXXXXXX 100644
83
--- a/docs/system/qemu-block-drivers.rst.inc
84
+++ b/docs/system/qemu-block-drivers.rst.inc
85
@@ -XXX,XX +XXX,XX @@ Hard disks
86
you may corrupt your host data (use the ``-snapshot`` command
87
line option or modify the device permissions accordingly).
88
89
+Zoned block devices
90
+ Zoned block devices can be passed through to the guest if the emulated storage
91
+ controller supports zoned storage. Use ``--blockdev host_device,
92
+ node-name=drive0,filename=/dev/nullb0,cache.direct=on`` to pass through
93
+ ``/dev/nullb0`` as ``drive0``.
94
+
95
Windows
96
^^^^^^^
97
98
--
99
2.39.2
New patch
1
From: Philippe Mathieu-Daudé <philmd@linaro.org>
1
2
3
Introduce the BdrvDmgUncompressFunc type definition. To emphasize that
4
dmg_uncompress_bz2 and dmg_uncompress_lzfse are pointers to functions,
5
declare them using this new typedef.
6
7
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
8
Message-id: 20230320152610.32052-1-philmd@linaro.org
9
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10
---
11
block/dmg.h | 8 ++++----
12
block/dmg.c | 7 ++-----
13
2 files changed, 6 insertions(+), 9 deletions(-)
14
15
diff --git a/block/dmg.h b/block/dmg.h
16
index XXXXXXX..XXXXXXX 100644
17
--- a/block/dmg.h
18
+++ b/block/dmg.h
19
@@ -XXX,XX +XXX,XX @@ typedef struct BDRVDMGState {
20
z_stream zstream;
21
} BDRVDMGState;
22
23
-extern int (*dmg_uncompress_bz2)(char *next_in, unsigned int avail_in,
24
- char *next_out, unsigned int avail_out);
25
+typedef int BdrvDmgUncompressFunc(char *next_in, unsigned int avail_in,
26
+ char *next_out, unsigned int avail_out);
27
28
-extern int (*dmg_uncompress_lzfse)(char *next_in, unsigned int avail_in,
29
- char *next_out, unsigned int avail_out);
30
+extern BdrvDmgUncompressFunc *dmg_uncompress_bz2;
31
+extern BdrvDmgUncompressFunc *dmg_uncompress_lzfse;
32
33
#endif
34
diff --git a/block/dmg.c b/block/dmg.c
35
index XXXXXXX..XXXXXXX 100644
36
--- a/block/dmg.c
37
+++ b/block/dmg.c
38
@@ -XXX,XX +XXX,XX @@
39
#include "qemu/memalign.h"
40
#include "dmg.h"
41
42
-int (*dmg_uncompress_bz2)(char *next_in, unsigned int avail_in,
43
- char *next_out, unsigned int avail_out);
44
-
45
-int (*dmg_uncompress_lzfse)(char *next_in, unsigned int avail_in,
46
- char *next_out, unsigned int avail_out);
47
+BdrvDmgUncompressFunc *dmg_uncompress_bz2;
48
+BdrvDmgUncompressFunc *dmg_uncompress_lzfse;
49
50
enum {
51
/* Limit chunk sizes to prevent unreasonable amounts of memory being used
52
--
53
2.39.2
54
55
New patch
1
From: Thomas De Schampheleire <thomas.de_schampheleire@nokia.com>
1
2
3
The event filename is an absolute path. Convert it to a relative path when
4
writing '#line' directives, to preserve reproducibility of the generated
5
output when different base paths are used.
6
7
Signed-off-by: Thomas De Schampheleire <thomas.de_schampheleire@nokia.com>
8
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
9
Message-Id: <20230406080045.21696-1-thomas.de_schampheleire@nokia.com>
10
---
11
scripts/tracetool/backend/ftrace.py | 4 +++-
12
scripts/tracetool/backend/log.py | 4 +++-
13
scripts/tracetool/backend/syslog.py | 4 +++-
14
3 files changed, 9 insertions(+), 3 deletions(-)
15
16
diff --git a/scripts/tracetool/backend/ftrace.py b/scripts/tracetool/backend/ftrace.py
17
index XXXXXXX..XXXXXXX 100644
18
--- a/scripts/tracetool/backend/ftrace.py
19
+++ b/scripts/tracetool/backend/ftrace.py
20
@@ -XXX,XX +XXX,XX @@
21
__email__ = "stefanha@redhat.com"
22
23
24
+import os.path
25
+
26
from tracetool import out
27
28
29
@@ -XXX,XX +XXX,XX @@ def generate_h(event, group):
30
args=event.args,
31
event_id="TRACE_" + event.name.upper(),
32
event_lineno=event.lineno,
33
- event_filename=event.filename,
34
+ event_filename=os.path.relpath(event.filename),
35
fmt=event.fmt.rstrip("\n"),
36
argnames=argnames)
37
38
diff --git a/scripts/tracetool/backend/log.py b/scripts/tracetool/backend/log.py
39
index XXXXXXX..XXXXXXX 100644
40
--- a/scripts/tracetool/backend/log.py
41
+++ b/scripts/tracetool/backend/log.py
42
@@ -XXX,XX +XXX,XX @@
43
__email__ = "stefanha@redhat.com"
44
45
46
+import os.path
47
+
48
from tracetool import out
49
50
51
@@ -XXX,XX +XXX,XX @@ def generate_h(event, group):
52
' }',
53
cond=cond,
54
event_lineno=event.lineno,
55
- event_filename=event.filename,
56
+ event_filename=os.path.relpath(event.filename),
57
name=event.name,
58
fmt=event.fmt.rstrip("\n"),
59
argnames=argnames)
60
diff --git a/scripts/tracetool/backend/syslog.py b/scripts/tracetool/backend/syslog.py
61
index XXXXXXX..XXXXXXX 100644
62
--- a/scripts/tracetool/backend/syslog.py
63
+++ b/scripts/tracetool/backend/syslog.py
64
@@ -XXX,XX +XXX,XX @@
65
__email__ = "stefanha@redhat.com"
66
67
68
+import os.path
69
+
70
from tracetool import out
71
72
73
@@ -XXX,XX +XXX,XX @@ def generate_h(event, group):
74
' }',
75
cond=cond,
76
event_lineno=event.lineno,
77
- event_filename=event.filename,
78
+ event_filename=os.path.relpath(event.filename),
79
name=event.name,
80
fmt=event.fmt.rstrip("\n"),
81
argnames=argnames)
82
--
83
2.39.2
New patch
From: Sam Li <faithilikerun@gmail.com>

Since Linux doesn't have a user API to issue zone append operations to
zoned devices from user space, the file-posix driver is modified to add
zone append emulation using regular writes. To do this, the file-posix
driver tracks the write pointer (wp) location of all zones of the device.
It uses an array of uint64_t. The most significant bit of each wp location
indicates whether the zone is a conventional zone.

A zone's wp can change as a result of the following operations:
- zone reset: change the wp to the start offset of that zone
- zone finish: change the wp to the end location of that zone
- write to a zone
- zone append

Signed-off-by: Sam Li <faithilikerun@gmail.com>
Message-id: 20230407081657.17947-2-faithilikerun@gmail.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
include/block/block-common.h | 14 +++
include/block/block_int-common.h | 5 +
block/file-posix.c | 173 ++++++++++++++++++++++++++++++-
3 files changed, 189 insertions(+), 3 deletions(-)
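
The write pointer bookkeeping described in the commit message can be modelled in a few lines. The following is a minimal sketch in Python, not the driver code: the ZoneWps class, its method names, and the zone geometry are assumptions made for illustration. It keeps one integer per zone, uses bit 63 as the conventional-zone flag, and applies the reset, finish, and write rules listed above.

```python
CONV_FLAG = 1 << 63  # most significant bit of a 64-bit entry marks a conventional zone


class ZoneWps:
    """Toy model of per-zone write pointers, kept as byte offsets."""

    def __init__(self, zone_size, nr_zones, conventional=()):
        self.zone_size = zone_size
        # Every zone starts empty: its wp sits at the zone's start offset.
        self.wp = [i * zone_size for i in range(nr_zones)]
        for i in conventional:
            self.wp[i] |= CONV_FLAG

    def is_conventional(self, index):
        return bool(self.wp[index] & CONV_FLAG)

    def _check_sequential(self, index):
        if self.is_conventional(index):
            raise ValueError("zone management is not allowed on conventional zones")

    def reset(self, index):
        """Zone reset: move the wp back to the start offset of the zone."""
        self._check_sequential(index)
        self.wp[index] = index * self.zone_size

    def finish(self, index):
        """Zone finish: move the wp to the end location of the zone."""
        self._check_sequential(index)
        self.wp[index] = (index + 1) * self.zone_size

    def write(self, offset, nbytes):
        """Regular write: advance the wp only if the write ends beyond it."""
        index = offset // self.zone_size
        if not self.is_conventional(index):
            self.wp[index] = max(self.wp[index], offset + nbytes)


zone_size = 256 * 1024 * 1024  # assumed zone size for the example
wps = ZoneWps(zone_size, nr_zones=4, conventional=[0])
wps.write(zone_size, 4096)  # write 4 KiB at the start of zone 1
print(hex(wps.wp[1]))
```

Folding the zone type into the top bit keeps the array at one 64-bit word per zone, at the cost of having to mask the flag before a conventional zone's entry is used as an offset.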
24
25
diff --git a/include/block/block-common.h b/include/block/block-common.h
26
index XXXXXXX..XXXXXXX 100644
27
--- a/include/block/block-common.h
28
+++ b/include/block/block-common.h
29
@@ -XXX,XX +XXX,XX @@ typedef struct BlockZoneDescriptor {
30
BlockZoneState state;
31
} BlockZoneDescriptor;
32
33
+/*
34
+ * Track write pointers of a zone in bytes.
35
+ */
36
+typedef struct BlockZoneWps {
37
+ CoMutex colock;
38
+ uint64_t wp[];
39
+} BlockZoneWps;
40
+
41
typedef struct BlockDriverInfo {
42
/* in bytes, 0 if irrelevant */
43
int cluster_size;
44
@@ -XXX,XX +XXX,XX @@ typedef enum {
45
#define BDRV_SECTOR_BITS 9
46
#define BDRV_SECTOR_SIZE (1ULL << BDRV_SECTOR_BITS)
47
48
+/*
49
+ * Get the first most significant bit of wp. If it is zero, then
50
+ * the zone type is SWR.
51
+ */
52
+#define BDRV_ZT_IS_CONV(wp) (wp & (1ULL << 63))
53
+
54
#define BDRV_REQUEST_MAX_SECTORS MIN_CONST(SIZE_MAX >> BDRV_SECTOR_BITS, \
55
INT_MAX >> BDRV_SECTOR_BITS)
56
#define BDRV_REQUEST_MAX_BYTES (BDRV_REQUEST_MAX_SECTORS << BDRV_SECTOR_BITS)
57
diff --git a/include/block/block_int-common.h b/include/block/block_int-common.h
58
index XXXXXXX..XXXXXXX 100644
59
--- a/include/block/block_int-common.h
60
+++ b/include/block/block_int-common.h
61
@@ -XXX,XX +XXX,XX @@ typedef struct BlockLimits {
62
63
/* maximum number of active zones */
64
int64_t max_active_zones;
65
+
66
+ int64_t write_granularity;
67
} BlockLimits;
68
69
typedef struct BdrvOpBlocker BdrvOpBlocker;
70
@@ -XXX,XX +XXX,XX @@ struct BlockDriverState {
71
CoMutex bsc_modify_lock;
72
/* Always non-NULL, but must only be dereferenced under an RCU read guard */
73
BdrvBlockStatusCache *block_status_cache;
74
+
75
+ /* array of write pointers' location of each zone in the zoned device. */
76
+ BlockZoneWps *wps;
77
};
78
79
struct BlockBackendRootState {
80
diff --git a/block/file-posix.c b/block/file-posix.c
81
index XXXXXXX..XXXXXXX 100644
82
--- a/block/file-posix.c
83
+++ b/block/file-posix.c
84
@@ -XXX,XX +XXX,XX @@ static int hdev_get_max_segments(int fd, struct stat *st)
85
#endif
86
}
87
88
+#if defined(CONFIG_BLKZONED)
89
+/*
90
+ * If the reset_all flag is true, then the wps of zone whose state is
91
+ * not readonly or offline should be all reset to the start sector.
92
+ * Else, take the real wp of the device.
93
+ */
94
+static int get_zones_wp(BlockDriverState *bs, int fd, int64_t offset,
95
+ unsigned int nrz, bool reset_all)
96
+{
97
+ struct blk_zone *blkz;
98
+ size_t rep_size;
99
+ uint64_t sector = offset >> BDRV_SECTOR_BITS;
100
+ BlockZoneWps *wps = bs->wps;
101
+ int j = offset / bs->bl.zone_size;
102
+ int ret, n = 0, i = 0;
103
+ rep_size = sizeof(struct blk_zone_report) + nrz * sizeof(struct blk_zone);
104
+ g_autofree struct blk_zone_report *rep = NULL;
105
+
106
+ rep = g_malloc(rep_size);
107
+ blkz = (struct blk_zone *)(rep + 1);
108
+ while (n < nrz) {
109
+ memset(rep, 0, rep_size);
110
+ rep->sector = sector;
111
+ rep->nr_zones = nrz - n;
112
+
113
+ do {
114
+ ret = ioctl(fd, BLKREPORTZONE, rep);
115
+ } while (ret != 0 && errno == EINTR);
116
+ if (ret != 0) {
117
+ error_report("%d: ioctl BLKREPORTZONE at %" PRId64 " failed %d",
118
+ fd, offset, errno);
119
+ return -errno;
120
+ }
121
+
122
+ if (!rep->nr_zones) {
123
+ break;
124
+ }
125
+
126
+ for (i = 0; i < rep->nr_zones; ++i, ++n, ++j) {
127
+ /*
128
+ * The wp tracking cares only about sequential writes required and
129
+ * sequential write preferred zones so that the wp can advance to
130
+ * the right location.
131
+ * Use the most significant bit of the wp location to indicate the
132
+ * zone type: 0 for SWR/SWP zones and 1 for conventional zones.
133
+ */
134
+ if (blkz[i].type == BLK_ZONE_TYPE_CONVENTIONAL) {
135
+ wps->wp[j] |= 1ULL << 63;
136
+ } else {
137
+ switch(blkz[i].cond) {
138
+ case BLK_ZONE_COND_FULL:
139
+ case BLK_ZONE_COND_READONLY:
140
+ /* Zone not writable */
141
+ wps->wp[j] = (blkz[i].start + blkz[i].len) << BDRV_SECTOR_BITS;
142
+ break;
143
+ case BLK_ZONE_COND_OFFLINE:
144
+ /* Zone not writable nor readable */
145
+ wps->wp[j] = (blkz[i].start) << BDRV_SECTOR_BITS;
146
+ break;
147
+ default:
148
+ if (reset_all) {
149
+ wps->wp[j] = blkz[i].start << BDRV_SECTOR_BITS;
150
+ } else {
151
+ wps->wp[j] = blkz[i].wp << BDRV_SECTOR_BITS;
152
+ }
153
+ break;
154
+ }
155
+ }
156
+ }
157
+ sector = blkz[i - 1].start + blkz[i - 1].len;
158
+ }
159
+
160
+ return 0;
161
+}
162
+
163
+static void update_zones_wp(BlockDriverState *bs, int fd, int64_t offset,
164
+ unsigned int nrz)
165
+{
166
+ if (get_zones_wp(bs, fd, offset, nrz, 0) < 0) {
167
+ error_report("update zone wp failed");
168
+ }
169
+}
170
+#endif
171
+
172
static void raw_refresh_limits(BlockDriverState *bs, Error **errp)
173
{
174
BDRVRawState *s = bs->opaque;
175
@@ -XXX,XX +XXX,XX @@ static void raw_refresh_limits(BlockDriverState *bs, Error **errp)
176
if (ret >= 0) {
177
bs->bl.max_active_zones = ret;
178
}
179
+
180
+ ret = get_sysfs_long_val(&st, "physical_block_size");
181
+ if (ret >= 0) {
182
+ bs->bl.write_granularity = ret;
183
+ }
184
+
185
+ /* The refresh_limits() function can be called multiple times. */
186
+ g_free(bs->wps);
187
+ bs->wps = g_malloc(sizeof(BlockZoneWps) +
188
+ sizeof(int64_t) * bs->bl.nr_zones);
189
+ ret = get_zones_wp(bs, s->fd, 0, bs->bl.nr_zones, 0);
190
+ if (ret < 0) {
191
+ error_setg_errno(errp, -ret, "report wps failed");
192
+ bs->wps = NULL;
193
+ return;
194
+ }
195
+ qemu_co_mutex_init(&bs->wps->colock);
196
return;
197
}
198
out:
199
@@ -XXX,XX +XXX,XX @@ static int coroutine_fn raw_co_prw(BlockDriverState *bs, uint64_t offset,
200
{
201
BDRVRawState *s = bs->opaque;
202
RawPosixAIOData acb;
203
+ int ret;
204
205
if (fd_open(bs) < 0)
206
return -EIO;
207
+#if defined(CONFIG_BLKZONED)
208
+ if (type & QEMU_AIO_WRITE && bs->wps) {
209
+ qemu_co_mutex_lock(&bs->wps->colock);
210
+ }
211
+#endif
212
213
/*
214
* When using O_DIRECT, the request must be aligned to be able to use
215
@@ -XXX,XX +XXX,XX @@ static int coroutine_fn raw_co_prw(BlockDriverState *bs, uint64_t offset,
216
} else if (s->use_linux_io_uring) {
217
LuringState *aio = aio_get_linux_io_uring(bdrv_get_aio_context(bs));
218
assert(qiov->size == bytes);
219
- return luring_co_submit(bs, aio, s->fd, offset, qiov, type);
220
+ ret = luring_co_submit(bs, aio, s->fd, offset, qiov, type);
221
+ goto out;
222
#endif
223
#ifdef CONFIG_LINUX_AIO
224
} else if (s->use_linux_aio) {
225
LinuxAioState *aio = aio_get_linux_aio(bdrv_get_aio_context(bs));
226
assert(qiov->size == bytes);
227
- return laio_co_submit(bs, aio, s->fd, offset, qiov, type,
228
+ ret = laio_co_submit(bs, aio, s->fd, offset, qiov, type,
229
s->aio_max_batch);
230
+ goto out;
231
#endif
232
}
233
234
@@ -XXX,XX +XXX,XX @@ static int coroutine_fn raw_co_prw(BlockDriverState *bs, uint64_t offset,
235
};
236
237
assert(qiov->size == bytes);
238
- return raw_thread_pool_submit(bs, handle_aiocb_rw, &acb);
239
+ ret = raw_thread_pool_submit(bs, handle_aiocb_rw, &acb);
240
+
241
+out:
242
+#if defined(CONFIG_BLKZONED)
243
+ BlockZoneWps *wps = bs->wps;
244
+ if (ret == 0) {
245
+ if (type & QEMU_AIO_WRITE && wps && bs->bl.zone_size) {
246
+ uint64_t *wp = &wps->wp[offset / bs->bl.zone_size];
247
+ if (!BDRV_ZT_IS_CONV(*wp)) {
248
+ /* Advance the wp if needed */
249
+ if (offset + bytes > *wp) {
250
+ *wp = offset + bytes;
251
+ }
252
+ }
253
+ }
254
+ } else {
255
+ if (type & QEMU_AIO_WRITE) {
256
+ update_zones_wp(bs, s->fd, 0, 1);
257
+ }
258
+ }
259
+
260
+ if (type & QEMU_AIO_WRITE && wps) {
261
+ qemu_co_mutex_unlock(&wps->colock);
262
+ }
263
+#endif
264
+ return ret;
265
}
266
267
static int coroutine_fn raw_co_preadv(BlockDriverState *bs, int64_t offset,
268
@@ -XXX,XX +XXX,XX @@ static void raw_close(BlockDriverState *bs)
269
BDRVRawState *s = bs->opaque;
270
271
if (s->fd >= 0) {
272
+#if defined(CONFIG_BLKZONED)
273
+ g_free(bs->wps);
274
+#endif
275
qemu_close(s->fd);
276
s->fd = -1;
277
}
278
@@ -XXX,XX +XXX,XX @@ static int coroutine_fn raw_co_zone_mgmt(BlockDriverState *bs, BlockZoneOp op,
279
const char *op_name;
280
unsigned long zo;
281
int ret;
282
+ BlockZoneWps *wps = bs->wps;
283
int64_t capacity = bs->total_sectors << BDRV_SECTOR_BITS;
284
285
zone_size = bs->bl.zone_size;
286
@@ -XXX,XX +XXX,XX @@ static int coroutine_fn raw_co_zone_mgmt(BlockDriverState *bs, BlockZoneOp op,
287
return -EINVAL;
288
}
289
290
+ QEMU_LOCK_GUARD(&wps->colock);
291
+ uint32_t i = offset / bs->bl.zone_size;
292
+ uint32_t nrz = len / bs->bl.zone_size;
293
+ uint64_t *wp = &wps->wp[i];
294
+ if (BDRV_ZT_IS_CONV(*wp) && len != capacity) {
295
+ error_report("zone mgmt operations are not allowed for conventional zones");
296
+ return -EIO;
297
+ }
298
+
299
switch (op) {
300
case BLK_ZO_OPEN:
301
op_name = "BLKOPENZONE";
302
@@ -XXX,XX +XXX,XX @@ static int coroutine_fn raw_co_zone_mgmt(BlockDriverState *bs, BlockZoneOp op,
303
len >> BDRV_SECTOR_BITS);
304
ret = raw_thread_pool_submit(bs, handle_aiocb_zone_mgmt, &acb);
305
if (ret != 0) {
306
+ update_zones_wp(bs, s->fd, offset, i);
307
error_report("ioctl %s failed %d", op_name, ret);
308
+ return ret;
309
+ }
310
+
311
+ if (zo == BLKRESETZONE && len == capacity) {
312
+ ret = get_zones_wp(bs, s->fd, 0, bs->bl.nr_zones, 1);
313
+ if (ret < 0) {
314
+ error_report("reporting single wp failed");
315
+ return ret;
316
+ }
317
+ } else if (zo == BLKRESETZONE) {
318
+ for (int j = 0; j < nrz; ++j) {
319
+ wp[j] = offset + j * zone_size;
320
+ }
321
+ } else if (zo == BLKFINISHZONE) {
322
+ for (int j = 0; j < nrz; ++j) {
323
+ /* The zoned device allows the last zone to be smaller than the
324
+ * zone size. */
325
+ wp[j] = MIN(offset + (j + 1) * zone_size, offset + len);
326
+ }
327
}
328
329
return ret;
330
--
331
2.39.2
New patch
From: Sam Li <faithilikerun@gmail.com>

A zone append command is a write operation that specifies the first
logical block of a zone as the write position. When writing to a zoned
block device using zone append, the byte offset of the call may point at
any position within the zone to which the data is being appended. Upon
completion the device will respond with the position where the data has
been written in the zone.

Signed-off-by: Sam Li <faithilikerun@gmail.com>
Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20230407081657.17947-3-faithilikerun@gmail.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
include/block/block-io.h | 4 +++
include/block/block_int-common.h | 3 ++
include/block/raw-aio.h | 4 ++-
include/sysemu/block-backend-io.h | 9 +++++
block/block-backend.c | 60 +++++++++++++++++++++++++++++++
block/file-posix.c | 58 ++++++++++++++++++++++++++----
block/io.c | 27 ++++++++++++++
block/io_uring.c | 4 +++
block/linux-aio.c | 3 ++
block/raw-format.c | 8 +++++
10 files changed, 172 insertions(+), 8 deletions(-)
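
The zone append semantics described in the commit message can be sketched as follows. This is a simplified Python model rather than the file-posix implementation: zone_append() and its arguments are hypothetical names, the write itself is omitted, and the zone boundary check is an addition of this sketch that is not part of the patch. The caller names the zone by its start offset, the data lands at the zone's current write pointer, and that position is reported back.

```python
def zone_append(wps, zone_size, offset, nbytes):
    """Emulate zone append on a list of per-zone write pointers (bytes).

    `offset` selects the target zone by its start offset. The return value
    is the byte position at which the data was placed.
    """
    if offset % zone_size:
        raise ValueError("offset is not aligned to the zone size")
    index = offset // zone_size
    position = wps[index]
    # Extra safety check for this sketch only (not taken from the patch).
    if position + nbytes > offset + zone_size:
        raise ValueError("append would cross the zone boundary")
    # A real driver would issue a regular write of nbytes at `position` here.
    wps[index] = position + nbytes
    return position


zone_size = 256 * 1024 * 1024  # assumed zone size for the example
wps = [0, zone_size]           # two empty zones
first = zone_append(wps, zone_size, 0, 0x3000)
second = zone_append(wps, zone_size, 0, 0x3000)
print(hex(first), hex(second), hex(wps[0]))
```

Both calls pass the same offset, yet the second one lands 0x3000 bytes further into the zone, which is why the caller has to read the resulting position back instead of assuming it.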
27
28
diff --git a/include/block/block-io.h b/include/block/block-io.h
29
index XXXXXXX..XXXXXXX 100644
30
--- a/include/block/block-io.h
31
+++ b/include/block/block-io.h
32
@@ -XXX,XX +XXX,XX @@ int coroutine_fn GRAPH_RDLOCK bdrv_co_zone_report(BlockDriverState *bs,
33
int coroutine_fn GRAPH_RDLOCK bdrv_co_zone_mgmt(BlockDriverState *bs,
34
BlockZoneOp op,
35
int64_t offset, int64_t len);
36
+int coroutine_fn GRAPH_RDLOCK bdrv_co_zone_append(BlockDriverState *bs,
37
+ int64_t *offset,
38
+ QEMUIOVector *qiov,
39
+ BdrvRequestFlags flags);
40
41
bool bdrv_can_write_zeroes_with_unmap(BlockDriverState *bs);
42
int bdrv_block_status(BlockDriverState *bs, int64_t offset,
43
diff --git a/include/block/block_int-common.h b/include/block/block_int-common.h
44
index XXXXXXX..XXXXXXX 100644
45
--- a/include/block/block_int-common.h
46
+++ b/include/block/block_int-common.h
47
@@ -XXX,XX +XXX,XX @@ struct BlockDriver {
48
BlockZoneDescriptor *zones);
49
int coroutine_fn (*bdrv_co_zone_mgmt)(BlockDriverState *bs, BlockZoneOp op,
50
int64_t offset, int64_t len);
51
+ int coroutine_fn (*bdrv_co_zone_append)(BlockDriverState *bs,
52
+ int64_t *offset, QEMUIOVector *qiov,
53
+ BdrvRequestFlags flags);
54
55
/* removable device specific */
56
bool coroutine_fn GRAPH_RDLOCK_PTR (*bdrv_co_is_inserted)(
57
diff --git a/include/block/raw-aio.h b/include/block/raw-aio.h
58
index XXXXXXX..XXXXXXX 100644
59
--- a/include/block/raw-aio.h
60
+++ b/include/block/raw-aio.h
61
@@ -XXX,XX +XXX,XX @@
62
#define QEMU_AIO_TRUNCATE 0x0080
63
#define QEMU_AIO_ZONE_REPORT 0x0100
64
#define QEMU_AIO_ZONE_MGMT 0x0200
65
+#define QEMU_AIO_ZONE_APPEND 0x0400
66
#define QEMU_AIO_TYPE_MASK \
67
(QEMU_AIO_READ | \
68
QEMU_AIO_WRITE | \
69
@@ -XXX,XX +XXX,XX @@
70
QEMU_AIO_COPY_RANGE | \
71
QEMU_AIO_TRUNCATE | \
72
QEMU_AIO_ZONE_REPORT | \
73
- QEMU_AIO_ZONE_MGMT)
74
+ QEMU_AIO_ZONE_MGMT | \
75
+ QEMU_AIO_ZONE_APPEND)
76
77
/* AIO flags */
78
#define QEMU_AIO_MISALIGNED 0x1000
79
diff --git a/include/sysemu/block-backend-io.h b/include/sysemu/block-backend-io.h
80
index XXXXXXX..XXXXXXX 100644
81
--- a/include/sysemu/block-backend-io.h
82
+++ b/include/sysemu/block-backend-io.h
83
@@ -XXX,XX +XXX,XX @@ BlockAIOCB *blk_aio_zone_report(BlockBackend *blk, int64_t offset,
84
BlockAIOCB *blk_aio_zone_mgmt(BlockBackend *blk, BlockZoneOp op,
85
int64_t offset, int64_t len,
86
BlockCompletionFunc *cb, void *opaque);
87
+BlockAIOCB *blk_aio_zone_append(BlockBackend *blk, int64_t *offset,
88
+ QEMUIOVector *qiov, BdrvRequestFlags flags,
89
+ BlockCompletionFunc *cb, void *opaque);
90
BlockAIOCB *blk_aio_pdiscard(BlockBackend *blk, int64_t offset, int64_t bytes,
91
BlockCompletionFunc *cb, void *opaque);
92
void blk_aio_cancel_async(BlockAIOCB *acb);
93
@@ -XXX,XX +XXX,XX @@ int coroutine_fn blk_co_zone_mgmt(BlockBackend *blk, BlockZoneOp op,
94
int64_t offset, int64_t len);
95
int co_wrapper_mixed blk_zone_mgmt(BlockBackend *blk, BlockZoneOp op,
96
int64_t offset, int64_t len);
97
+int coroutine_fn blk_co_zone_append(BlockBackend *blk, int64_t *offset,
98
+ QEMUIOVector *qiov,
99
+ BdrvRequestFlags flags);
100
+int co_wrapper_mixed blk_zone_append(BlockBackend *blk, int64_t *offset,
101
+ QEMUIOVector *qiov,
102
+ BdrvRequestFlags flags);
103
104
int co_wrapper_mixed blk_pdiscard(BlockBackend *blk, int64_t offset,
105
int64_t bytes);
106
diff --git a/block/block-backend.c b/block/block-backend.c
107
index XXXXXXX..XXXXXXX 100644
108
--- a/block/block-backend.c
109
+++ b/block/block-backend.c
110
@@ -XXX,XX +XXX,XX @@ BlockAIOCB *blk_aio_zone_mgmt(BlockBackend *blk, BlockZoneOp op,
111
return &acb->common;
112
}
113
114
+static void coroutine_fn blk_aio_zone_append_entry(void *opaque)
115
+{
116
+ BlkAioEmAIOCB *acb = opaque;
117
+ BlkRwCo *rwco = &acb->rwco;
118
+
119
+ rwco->ret = blk_co_zone_append(rwco->blk, (int64_t *)acb->bytes,
120
+ rwco->iobuf, rwco->flags);
121
+ blk_aio_complete(acb);
122
+}
123
+
124
+BlockAIOCB *blk_aio_zone_append(BlockBackend *blk, int64_t *offset,
125
+ QEMUIOVector *qiov, BdrvRequestFlags flags,
126
+ BlockCompletionFunc *cb, void *opaque) {
127
+ BlkAioEmAIOCB *acb;
128
+ Coroutine *co;
129
+ IO_CODE();
130
+
131
+ blk_inc_in_flight(blk);
132
+ acb = blk_aio_get(&blk_aio_em_aiocb_info, blk, cb, opaque);
133
+ acb->rwco = (BlkRwCo) {
134
+ .blk = blk,
135
+ .ret = NOT_DONE,
136
+ .flags = flags,
137
+ .iobuf = qiov,
138
+ };
139
+ acb->bytes = (int64_t)offset;
140
+ acb->has_returned = false;
141
+
142
+ co = qemu_coroutine_create(blk_aio_zone_append_entry, acb);
143
+ aio_co_enter(blk_get_aio_context(blk), co);
144
+ acb->has_returned = true;
145
+ if (acb->rwco.ret != NOT_DONE) {
146
+ replay_bh_schedule_oneshot_event(blk_get_aio_context(blk),
147
+ blk_aio_complete_bh, acb);
148
+ }
149
+
150
+ return &acb->common;
151
+}
152
+
153
/*
154
* Send a zone_report command.
155
* offset is a byte offset from the start of the device. No alignment
156
@@ -XXX,XX +XXX,XX @@ int coroutine_fn blk_co_zone_mgmt(BlockBackend *blk, BlockZoneOp op,
157
return ret;
158
}
159
160
+/*
161
+ * Send a zone_append command.
162
+ */
163
+int coroutine_fn blk_co_zone_append(BlockBackend *blk, int64_t *offset,
164
+ QEMUIOVector *qiov, BdrvRequestFlags flags)
165
+{
166
+ int ret;
167
+ IO_CODE();
168
+
169
+ blk_inc_in_flight(blk);
170
+ blk_wait_while_drained(blk);
171
+ if (!blk_is_available(blk)) {
172
+ blk_dec_in_flight(blk);
173
+ return -ENOMEDIUM;
174
+ }
175
+
176
+ ret = bdrv_co_zone_append(blk_bs(blk), offset, qiov, flags);
177
+ blk_dec_in_flight(blk);
178
+ return ret;
179
+}
180
+
181
void blk_drain(BlockBackend *blk)
182
{
183
BlockDriverState *bs = blk_bs(blk);
184
diff --git a/block/file-posix.c b/block/file-posix.c
185
index XXXXXXX..XXXXXXX 100644
186
--- a/block/file-posix.c
187
+++ b/block/file-posix.c
188
@@ -XXX,XX +XXX,XX @@ typedef struct BDRVRawState {
189
bool has_write_zeroes:1;
190
bool use_linux_aio:1;
191
bool use_linux_io_uring:1;
192
+ int64_t *offset; /* offset of zone append operation */
193
int page_cache_inconsistent; /* errno from fdatasync failure */
194
bool has_fallocate;
195
bool needs_alignment;
196
@@ -XXX,XX +XXX,XX @@ static ssize_t handle_aiocb_rw_vector(RawPosixAIOData *aiocb)
197
ssize_t len;
198
199
len = RETRY_ON_EINTR(
200
- (aiocb->aio_type & QEMU_AIO_WRITE) ?
201
+ (aiocb->aio_type & (QEMU_AIO_WRITE | QEMU_AIO_ZONE_APPEND)) ?
202
qemu_pwritev(aiocb->aio_fildes,
203
aiocb->io.iov,
204
aiocb->io.niov,
205
@@ -XXX,XX +XXX,XX @@ static ssize_t handle_aiocb_rw_linear(RawPosixAIOData *aiocb, char *buf)
206
ssize_t len;
207
208
while (offset < aiocb->aio_nbytes) {
209
- if (aiocb->aio_type & QEMU_AIO_WRITE) {
210
+ if (aiocb->aio_type & (QEMU_AIO_WRITE | QEMU_AIO_ZONE_APPEND)) {
211
len = pwrite(aiocb->aio_fildes,
212
(const char *)buf + offset,
213
aiocb->aio_nbytes - offset,
214
@@ -XXX,XX +XXX,XX @@ static int handle_aiocb_rw(void *opaque)
215
}
216
217
nbytes = handle_aiocb_rw_linear(aiocb, buf);
218
- if (!(aiocb->aio_type & QEMU_AIO_WRITE)) {
219
+ if (!(aiocb->aio_type & (QEMU_AIO_WRITE | QEMU_AIO_ZONE_APPEND))) {
220
char *p = buf;
221
size_t count = aiocb->aio_nbytes, copy;
222
int i;
223
@@ -XXX,XX +XXX,XX @@ static int coroutine_fn raw_co_prw(BlockDriverState *bs, uint64_t offset,
224
if (fd_open(bs) < 0)
225
return -EIO;
226
#if defined(CONFIG_BLKZONED)
227
- if (type & QEMU_AIO_WRITE && bs->wps) {
228
+ if ((type & (QEMU_AIO_WRITE | QEMU_AIO_ZONE_APPEND)) && bs->wps) {
229
qemu_co_mutex_lock(&bs->wps->colock);
230
+ if (type & QEMU_AIO_ZONE_APPEND && bs->bl.zone_size) {
231
+ int index = offset / bs->bl.zone_size;
232
+ offset = bs->wps->wp[index];
233
+ }
234
}
235
#endif
236
237
@@ -XXX,XX +XXX,XX @@ out:
238
#if defined(CONFIG_BLKZONED)
239
BlockZoneWps *wps = bs->wps;
240
if (ret == 0) {
241
- if (type & QEMU_AIO_WRITE && wps && bs->bl.zone_size) {
242
+ if ((type & (QEMU_AIO_WRITE | QEMU_AIO_ZONE_APPEND))
243
+ && wps && bs->bl.zone_size) {
244
uint64_t *wp = &wps->wp[offset / bs->bl.zone_size];
245
if (!BDRV_ZT_IS_CONV(*wp)) {
246
+ if (type & QEMU_AIO_ZONE_APPEND) {
247
+ *s->offset = *wp;
248
+ }
249
/* Advance the wp if needed */
250
if (offset + bytes > *wp) {
251
*wp = offset + bytes;
252
@@ -XXX,XX +XXX,XX @@ out:
253
}
254
}
255
} else {
256
- if (type & QEMU_AIO_WRITE) {
257
+ if (type & (QEMU_AIO_WRITE | QEMU_AIO_ZONE_APPEND)) {
258
update_zones_wp(bs, s->fd, 0, 1);
259
}
260
}
261
262
- if (type & QEMU_AIO_WRITE && wps) {
263
+ if ((type & (QEMU_AIO_WRITE | QEMU_AIO_ZONE_APPEND)) && wps) {
264
qemu_co_mutex_unlock(&wps->colock);
265
}
266
#endif
267
@@ -XXX,XX +XXX,XX @@ static int coroutine_fn raw_co_zone_mgmt(BlockDriverState *bs, BlockZoneOp op,
268
}
269
#endif
270
271
+#if defined(CONFIG_BLKZONED)
272
+static int coroutine_fn raw_co_zone_append(BlockDriverState *bs,
273
+ int64_t *offset,
274
+ QEMUIOVector *qiov,
275
+ BdrvRequestFlags flags) {
276
+ assert(flags == 0);
277
+ int64_t zone_size_mask = bs->bl.zone_size - 1;
278
+ int64_t iov_len = 0;
279
+ int64_t len = 0;
280
+ BDRVRawState *s = bs->opaque;
281
+ s->offset = offset;
282
+
283
+ if (*offset & zone_size_mask) {
284
+ error_report("sector offset %" PRId64 " is not aligned to zone size "
285
+ "%" PRId32 "", *offset / 512, bs->bl.zone_size / 512);
286
+ return -EINVAL;
287
+ }
288
+
289
+ int64_t wg = bs->bl.write_granularity;
290
+ int64_t wg_mask = wg - 1;
291
+ for (int i = 0; i < qiov->niov; i++) {
292
+ iov_len = qiov->iov[i].iov_len;
293
+ if (iov_len & wg_mask) {
294
+ error_report("len of IOVector[%d] %" PRId64 " is not aligned to "
295
+ "block size %" PRId64 "", i, iov_len, wg);
296
+ return -EINVAL;
297
+ }
298
+ len += iov_len;
299
+ }
300
+
301
+ return raw_co_prw(bs, *offset, len, qiov, QEMU_AIO_ZONE_APPEND);
302
+}
303
+#endif
304
+
305
static coroutine_fn int
306
raw_do_pdiscard(BlockDriverState *bs, int64_t offset, int64_t bytes,
307
bool blkdev)
308
@@ -XXX,XX +XXX,XX @@ static BlockDriver bdrv_host_device = {
309
/* zone management operations */
310
.bdrv_co_zone_report = raw_co_zone_report,
311
.bdrv_co_zone_mgmt = raw_co_zone_mgmt,
312
+ .bdrv_co_zone_append = raw_co_zone_append,
313
#endif
314
};
315
316
diff --git a/block/io.c b/block/io.c
317
index XXXXXXX..XXXXXXX 100644
318
--- a/block/io.c
319
+++ b/block/io.c
320
@@ -XXX,XX +XXX,XX @@ out:
321
return co.ret;
322
}
323
324
+int coroutine_fn bdrv_co_zone_append(BlockDriverState *bs, int64_t *offset,
325
+ QEMUIOVector *qiov,
326
+ BdrvRequestFlags flags)
327
+{
328
+ int ret;
329
+ BlockDriver *drv = bs->drv;
330
+ CoroutineIOCompletion co = {
331
+ .coroutine = qemu_coroutine_self(),
332
+ };
333
+ IO_CODE();
334
+
335
+ ret = bdrv_check_qiov_request(*offset, qiov->size, qiov, 0, NULL);
336
+ if (ret < 0) {
337
+ return ret;
338
+ }
339
+
340
+ bdrv_inc_in_flight(bs);
341
+ if (!drv || !drv->bdrv_co_zone_append || bs->bl.zoned == BLK_Z_NONE) {
342
+ co.ret = -ENOTSUP;
343
+ goto out;
344
+ }
345
+ co.ret = drv->bdrv_co_zone_append(bs, offset, qiov, flags);
346
+out:
347
+ bdrv_dec_in_flight(bs);
348
+ return co.ret;
349
+}
350
+
351
void *qemu_blockalign(BlockDriverState *bs, size_t size)
352
{
353
IO_CODE();
354
diff --git a/block/io_uring.c b/block/io_uring.c
355
index XXXXXXX..XXXXXXX 100644
356
--- a/block/io_uring.c
357
+++ b/block/io_uring.c
358
@@ -XXX,XX +XXX,XX @@ static int luring_do_submit(int fd, LuringAIOCB *luringcb, LuringState *s,
359
io_uring_prep_writev(sqes, fd, luringcb->qiov->iov,
360
luringcb->qiov->niov, offset);
361
break;
362
+ case QEMU_AIO_ZONE_APPEND:
363
+ io_uring_prep_writev(sqes, fd, luringcb->qiov->iov,
364
+ luringcb->qiov->niov, offset);
365
+ break;
366
case QEMU_AIO_READ:
367
io_uring_prep_readv(sqes, fd, luringcb->qiov->iov,
368
luringcb->qiov->niov, offset);
369
diff --git a/block/linux-aio.c b/block/linux-aio.c
370
index XXXXXXX..XXXXXXX 100644
371
--- a/block/linux-aio.c
372
+++ b/block/linux-aio.c
373
@@ -XXX,XX +XXX,XX @@ static int laio_do_submit(int fd, struct qemu_laiocb *laiocb, off_t offset,
374
case QEMU_AIO_WRITE:
375
io_prep_pwritev(iocbs, fd, qiov->iov, qiov->niov, offset);
376
break;
377
+ case QEMU_AIO_ZONE_APPEND:
378
+ io_prep_pwritev(iocbs, fd, qiov->iov, qiov->niov, offset);
379
+ break;
380
case QEMU_AIO_READ:
381
io_prep_preadv(iocbs, fd, qiov->iov, qiov->niov, offset);
382
break;
383
diff --git a/block/raw-format.c b/block/raw-format.c
384
index XXXXXXX..XXXXXXX 100644
385
--- a/block/raw-format.c
386
+++ b/block/raw-format.c
387
@@ -XXX,XX +XXX,XX @@ raw_co_zone_mgmt(BlockDriverState *bs, BlockZoneOp op,
388
return bdrv_co_zone_mgmt(bs->file->bs, op, offset, len);
389
}
390
391
+static int coroutine_fn GRAPH_RDLOCK
392
+raw_co_zone_append(BlockDriverState *bs,int64_t *offset, QEMUIOVector *qiov,
393
+ BdrvRequestFlags flags)
394
+{
395
+ return bdrv_co_zone_append(bs->file->bs, offset, qiov, flags);
396
+}
397
+
398
static int64_t coroutine_fn GRAPH_RDLOCK
399
raw_co_getlength(BlockDriverState *bs)
400
{
401
@@ -XXX,XX +XXX,XX @@ BlockDriver bdrv_raw = {
402
.bdrv_co_pdiscard = &raw_co_pdiscard,
403
.bdrv_co_zone_report = &raw_co_zone_report,
404
.bdrv_co_zone_mgmt = &raw_co_zone_mgmt,
405
+ .bdrv_co_zone_append = &raw_co_zone_append,
406
.bdrv_co_block_status = &raw_co_block_status,
407
.bdrv_co_copy_range_from = &raw_co_copy_range_from,
408
.bdrv_co_copy_range_to = &raw_co_copy_range_to,
409
--
410
2.39.2
diff view generated by jsdifflib
1
The GLib documentation says "a NULL-terminated array of GOptionEntrys"
1
From: Sam Li <faithilikerun@gmail.com>
2
so we'd better make sure there is a terminator that lets
3
g_option_context_add_main_entries() know when the end of the array has
4
been reached.
5
2
6
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
3
The patch tests zone append writes by reporting the zone wp after
7
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
4
the completion of the call. "zap -p" option can print the sector
8
Message-id: 20220411150057.3009667-1-stefanha@redhat.com
5
offset value after completion, which should be the start sector
6
where the append write begins.
7
8
Signed-off-by: Sam Li <faithilikerun@gmail.com>
9
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
10
Message-id: 20230407081657.17947-4-faithilikerun@gmail.com
9
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
11
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10
---
12
---
11
contrib/vhost-user-blk/vhost-user-blk.c | 3 ++-
13
qemu-io-cmds.c | 75 ++++++++++++++++++++++++++++++
12
1 file changed, 2 insertions(+), 1 deletion(-)
14
tests/qemu-iotests/tests/zoned | 16 +++++++
15
tests/qemu-iotests/tests/zoned.out | 16 +++++++
16
3 files changed, 107 insertions(+)
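
The values expected in zoned.out follow from simple arithmetic, which the short Python sketch below works through. It assumes 512-byte sectors and a zone size of 268435456 bytes, as used by the test commands; the expected_results() helper is hypothetical. Each "zap -p <offset> 0x1000 0x2000" appends 0x3000 bytes, so the reported append sector is the write pointer before the append and the following "zrp" shows it advanced by 0x18 sectors.

```python
SECTOR_SIZE = 512
ZONE_SIZE = 268435456          # bytes, as used by the zoned test
APPEND_LEN = 0x1000 + 0x2000   # two iovecs per zap command


def expected_results(zone_start, appends):
    """Return (append_sector, wptr_after) pairs, in hex, for one zone."""
    wp = zone_start
    results = []
    for _ in range(appends):
        append_sector = wp // SECTOR_SIZE
        wp += APPEND_LEN
        results.append((hex(append_sector), hex(wp // SECTOR_SIZE)))
    return results


first_zone = expected_results(0, 2)
second_zone = expected_results(ZONE_SIZE, 2)
print(first_zone)
print(second_zone)
```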
13
17
14
diff --git a/contrib/vhost-user-blk/vhost-user-blk.c b/contrib/vhost-user-blk/vhost-user-blk.c
18
diff --git a/qemu-io-cmds.c b/qemu-io-cmds.c
15
index XXXXXXX..XXXXXXX 100644
19
index XXXXXXX..XXXXXXX 100644
16
--- a/contrib/vhost-user-blk/vhost-user-blk.c
20
--- a/qemu-io-cmds.c
17
+++ b/contrib/vhost-user-blk/vhost-user-blk.c
21
+++ b/qemu-io-cmds.c
18
@@ -XXX,XX +XXX,XX @@ static GOptionEntry entries[] = {
22
@@ -XXX,XX +XXX,XX @@ static const cmdinfo_t zone_reset_cmd = {
19
{"blk-file", 'b', 0, G_OPTION_ARG_FILENAME, &opt_blk_file,
23
.oneline = "reset a zone write pointer in zone block device",
20
"block device or file path", "PATH"},
21
{ "read-only", 'r', 0, G_OPTION_ARG_NONE, &opt_read_only,
22
- "Enable read-only", NULL }
23
+ "Enable read-only", NULL },
24
+ { NULL, },
25
};
24
};
26
25
27
int main(int argc, char **argv)
26
+static int do_aio_zone_append(BlockBackend *blk, QEMUIOVector *qiov,
27
+ int64_t *offset, int flags, int *total)
28
+{
29
+ int async_ret = NOT_DONE;
30
+
31
+ blk_aio_zone_append(blk, offset, qiov, flags, aio_rw_done, &async_ret);
32
+ while (async_ret == NOT_DONE) {
33
+ main_loop_wait(false);
34
+ }
35
+
36
+ *total = qiov->size;
37
+ return async_ret < 0 ? async_ret : 1;
38
+}
39
+
40
+static int zone_append_f(BlockBackend *blk, int argc, char **argv)
41
+{
42
+ int ret;
43
+ bool pflag = false;
44
+ int flags = 0;
45
+ int total = 0;
46
+ int64_t offset;
47
+ char *buf;
48
+ int c, nr_iov;
49
+ int pattern = 0xcd;
50
+ QEMUIOVector qiov;
51
+
52
+ if (optind > argc - 3) {
53
+ return -EINVAL;
54
+ }
55
+
56
+ if ((c = getopt(argc, argv, "p")) != -1) {
57
+ pflag = true;
58
+ }
59
+
60
+ offset = cvtnum(argv[optind]);
61
+ if (offset < 0) {
62
+ print_cvtnum_err(offset, argv[optind]);
63
+ return offset;
64
+ }
65
+ optind++;
66
+ nr_iov = argc - optind;
67
+ buf = create_iovec(blk, &qiov, &argv[optind], nr_iov, pattern,
68
+ flags & BDRV_REQ_REGISTERED_BUF);
69
+ if (buf == NULL) {
70
+ return -EINVAL;
71
+ }
72
+ ret = do_aio_zone_append(blk, &qiov, &offset, flags, &total);
73
+ if (ret < 0) {
74
+ printf("zone append failed: %s\n", strerror(-ret));
75
+ goto out;
76
+ }
77
+
78
+ if (pflag) {
79
+ printf("After zap done, the append sector is 0x%" PRIx64 "\n",
80
+ tosector(offset));
81
+ }
82
+
83
+out:
84
+ qemu_io_free(blk, buf, qiov.size,
85
+ flags & BDRV_REQ_REGISTERED_BUF);
86
+ qemu_iovec_destroy(&qiov);
87
+ return ret;
88
+}
89
+
90
+static const cmdinfo_t zone_append_cmd = {
91
+ .name = "zone_append",
92
+ .altname = "zap",
93
+ .cfunc = zone_append_f,
94
+ .argmin = 3,
95
+ .argmax = 4,
96
+ .args = "offset len [len..]",
97
+ .oneline = "append write a number of bytes at a specified offset",
98
+};
99
+
100
static int truncate_f(BlockBackend *blk, int argc, char **argv);
101
static const cmdinfo_t truncate_cmd = {
102
.name = "truncate",
103
@@ -XXX,XX +XXX,XX @@ static void __attribute((constructor)) init_qemuio_commands(void)
104
qemuio_add_command(&zone_close_cmd);
105
qemuio_add_command(&zone_finish_cmd);
106
qemuio_add_command(&zone_reset_cmd);
107
+ qemuio_add_command(&zone_append_cmd);
108
qemuio_add_command(&truncate_cmd);
109
qemuio_add_command(&length_cmd);
110
qemuio_add_command(&info_cmd);
111
diff --git a/tests/qemu-iotests/tests/zoned b/tests/qemu-iotests/tests/zoned
112
index XXXXXXX..XXXXXXX 100755
113
--- a/tests/qemu-iotests/tests/zoned
114
+++ b/tests/qemu-iotests/tests/zoned
115
@@ -XXX,XX +XXX,XX @@ echo "(5) resetting the second zone"
116
$QEMU_IO $IMG -c "zrs 268435456 268435456"
117
echo "After resetting a zone:"
118
$QEMU_IO $IMG -c "zrp 268435456 1"
119
+echo
120
+echo
121
+echo "(6) append write" # the physical block size of the device is 4096
122
+$QEMU_IO $IMG -c "zrp 0 1"
123
+$QEMU_IO $IMG -c "zap -p 0 0x1000 0x2000"
124
+echo "After appending the first zone firstly:"
125
+$QEMU_IO $IMG -c "zrp 0 1"
126
+$QEMU_IO $IMG -c "zap -p 0 0x1000 0x2000"
127
+echo "After appending the first zone secondly:"
128
+$QEMU_IO $IMG -c "zrp 0 1"
129
+$QEMU_IO $IMG -c "zap -p 268435456 0x1000 0x2000"
130
+echo "After appending the second zone firstly:"
131
+$QEMU_IO $IMG -c "zrp 268435456 1"
132
+$QEMU_IO $IMG -c "zap -p 268435456 0x1000 0x2000"
133
+echo "After appending the second zone secondly:"
134
+$QEMU_IO $IMG -c "zrp 268435456 1"
135
136
# success, all done
137
echo "*** done"
138
diff --git a/tests/qemu-iotests/tests/zoned.out b/tests/qemu-iotests/tests/zoned.out
139
index XXXXXXX..XXXXXXX 100644
140
--- a/tests/qemu-iotests/tests/zoned.out
141
+++ b/tests/qemu-iotests/tests/zoned.out
142
@@ -XXX,XX +XXX,XX @@ start: 0x80000, len 0x80000, cap 0x80000, wptr 0x100000, zcond:14, [type: 2]
143
(5) resetting the second zone
144
After resetting a zone:
145
start: 0x80000, len 0x80000, cap 0x80000, wptr 0x80000, zcond:1, [type: 2]
146
+
147
+
148
+(6) append write
149
+start: 0x0, len 0x80000, cap 0x80000, wptr 0x0, zcond:1, [type: 2]
150
+After zap done, the append sector is 0x0
151
+After appending the first zone firstly:
152
+start: 0x0, len 0x80000, cap 0x80000, wptr 0x18, zcond:2, [type: 2]
153
+After zap done, the append sector is 0x18
154
+After appending the first zone secondly:
155
+start: 0x0, len 0x80000, cap 0x80000, wptr 0x30, zcond:2, [type: 2]
156
+After zap done, the append sector is 0x80000
157
+After appending the second zone firstly:
158
+start: 0x80000, len 0x80000, cap 0x80000, wptr 0x80018, zcond:2, [type: 2]
159
+After zap done, the append sector is 0x80018
160
+After appending the second zone secondly:
161
+start: 0x80000, len 0x80000, cap 0x80000, wptr 0x80030, zcond:2, [type: 2]
162
*** done
28
--
163
--
29
2.35.1
164
2.39.2
diff view generated by jsdifflib
New patch
1
From: Sam Li <faithilikerun@gmail.com>
1
2
3
Signed-off-by: Sam Li <faithilikerun@gmail.com>
4
Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
5
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
6
Message-id: 20230407081657.17947-5-faithilikerun@gmail.com
7
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
8
---
9
block/file-posix.c | 3 +++
10
block/trace-events | 2 ++
11
2 files changed, 5 insertions(+)
12
13
diff --git a/block/file-posix.c b/block/file-posix.c
14
index XXXXXXX..XXXXXXX 100644
15
--- a/block/file-posix.c
16
+++ b/block/file-posix.c
17
@@ -XXX,XX +XXX,XX @@ out:
18
if (!BDRV_ZT_IS_CONV(*wp)) {
19
if (type & QEMU_AIO_ZONE_APPEND) {
20
*s->offset = *wp;
21
+ trace_zbd_zone_append_complete(bs, *s->offset
22
+ >> BDRV_SECTOR_BITS);
23
}
24
/* Advance the wp if needed */
25
if (offset + bytes > *wp) {
26
@@ -XXX,XX +XXX,XX @@ static int coroutine_fn raw_co_zone_append(BlockDriverState *bs,
27
len += iov_len;
28
}
29
30
+ trace_zbd_zone_append(bs, *offset >> BDRV_SECTOR_BITS);
31
return raw_co_prw(bs, *offset, len, qiov, QEMU_AIO_ZONE_APPEND);
32
}
33
#endif
34
diff --git a/block/trace-events b/block/trace-events
35
index XXXXXXX..XXXXXXX 100644
36
--- a/block/trace-events
37
+++ b/block/trace-events
38
@@ -XXX,XX +XXX,XX @@ file_hdev_is_sg(int type, int version) "SG device found: type=%d, version=%d"
39
file_flush_fdatasync_failed(int err) "errno %d"
40
zbd_zone_report(void *bs, unsigned int nr_zones, int64_t sector) "bs %p report %d zones starting at sector offset 0x%" PRIx64 ""
41
zbd_zone_mgmt(void *bs, const char *op_name, int64_t sector, int64_t len) "bs %p %s starts at sector offset 0x%" PRIx64 " over a range of 0x%" PRIx64 " sectors"
42
+zbd_zone_append(void *bs, int64_t sector) "bs %p append at sector offset 0x%" PRIx64 ""
43
+zbd_zone_append_complete(void *bs, int64_t sector) "bs %p returns append sector 0x%" PRIx64 ""
44
45
# ssh.c
46
sftp_error(const char *op, const char *ssh_err, int ssh_err_code, int sftp_err_code) "%s failed: %s (libssh error code: %d, sftp error code: %d)"
47
--
48
2.39.2
diff view generated by jsdifflib
New patch
1
From: Sam Li <faithilikerun@gmail.com>
1
2
3
Use scripts/update-linux-headers.sh to update headers to 6.3-rc1.
4
5
Signed-off-by: Sam Li <faithilikerun@gmail.com>
6
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
7
Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
8
Message-id: 20230407082528.18841-2-faithilikerun@gmail.com
9
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10
---
11
include/standard-headers/drm/drm_fourcc.h | 12 +++
12
include/standard-headers/linux/ethtool.h | 48 ++++++++-
13
include/standard-headers/linux/fuse.h | 45 +++++++-
14
include/standard-headers/linux/pci_regs.h | 1 +
15
include/standard-headers/linux/vhost_types.h | 2 +
16
include/standard-headers/linux/virtio_blk.h | 105 +++++++++++++++++++
17
linux-headers/asm-arm64/kvm.h | 1 +
18
linux-headers/asm-x86/kvm.h | 34 +++++-
19
linux-headers/linux/kvm.h | 9 ++
20
linux-headers/linux/vfio.h | 15 +--
21
linux-headers/linux/vhost.h | 8 ++
22
11 files changed, 270 insertions(+), 10 deletions(-)
23
24
diff --git a/include/standard-headers/drm/drm_fourcc.h b/include/standard-headers/drm/drm_fourcc.h
25
index XXXXXXX..XXXXXXX 100644
26
--- a/include/standard-headers/drm/drm_fourcc.h
27
+++ b/include/standard-headers/drm/drm_fourcc.h
28
@@ -XXX,XX +XXX,XX @@ extern "C" {
29
*
30
* The authoritative list of format modifier codes is found in
31
* `include/uapi/drm/drm_fourcc.h`
32
+ *
33
+ * Open Source User Waiver
34
+ * -----------------------
35
+ *
36
+ * Because this is the authoritative source for pixel formats and modifiers
37
+ * referenced by GL, Vulkan extensions and other standards and hence used both
38
+ * by open source and closed source driver stacks, the usual requirement for an
39
+ * upstream in-kernel or open source userspace user does not apply.
40
+ *
41
+ * To ensure, as much as feasible, compatibility across stacks and avoid
42
+ * confusion with incompatible enumerations stakeholders for all relevant driver
43
+ * stacks should approve additions.
44
*/
45
46
#define fourcc_code(a, b, c, d) ((uint32_t)(a) | ((uint32_t)(b) << 8) | \
47
diff --git a/include/standard-headers/linux/ethtool.h b/include/standard-headers/linux/ethtool.h
48
index XXXXXXX..XXXXXXX 100644
49
--- a/include/standard-headers/linux/ethtool.h
50
+++ b/include/standard-headers/linux/ethtool.h
51
@@ -XXX,XX +XXX,XX @@ enum ethtool_stringset {
52
    ETH_SS_COUNT
53
};
54
55
+/**
56
+ * enum ethtool_mac_stats_src - source of ethtool MAC statistics
57
+ * @ETHTOOL_MAC_STATS_SRC_AGGREGATE:
58
+ *    if device supports a MAC merge layer, this retrieves the aggregate
59
+ *    statistics of the eMAC and pMAC. Otherwise, it retrieves just the
60
+ *    statistics of the single (express) MAC.
61
+ * @ETHTOOL_MAC_STATS_SRC_EMAC:
62
+ *    if device supports a MM layer, this retrieves the eMAC statistics.
63
+ *    Otherwise, it retrieves the statistics of the single (express) MAC.
64
+ * @ETHTOOL_MAC_STATS_SRC_PMAC:
65
+ *    if device supports a MM layer, this retrieves the pMAC statistics.
66
+ */
67
+enum ethtool_mac_stats_src {
68
+    ETHTOOL_MAC_STATS_SRC_AGGREGATE,
69
+    ETHTOOL_MAC_STATS_SRC_EMAC,
70
+    ETHTOOL_MAC_STATS_SRC_PMAC,
71
+};
72
+
73
/**
74
* enum ethtool_module_power_mode_policy - plug-in module power mode policy
75
* @ETHTOOL_MODULE_POWER_MODE_POLICY_HIGH: Module is always in high power mode.
76
@@ -XXX,XX +XXX,XX @@ enum ethtool_podl_pse_pw_d_status {
77
    ETHTOOL_PODL_PSE_PW_D_STATUS_ERROR,
78
};
79
80
+/**
81
+ * enum ethtool_mm_verify_status - status of MAC Merge Verify function
82
+ * @ETHTOOL_MM_VERIFY_STATUS_UNKNOWN:
83
+ *    verification status is unknown
84
+ * @ETHTOOL_MM_VERIFY_STATUS_INITIAL:
85
+ *    the 802.3 Verify State diagram is in the state INIT_VERIFICATION
86
+ * @ETHTOOL_MM_VERIFY_STATUS_VERIFYING:
87
+ *    the Verify State diagram is in the state VERIFICATION_IDLE,
88
+ *    SEND_VERIFY or WAIT_FOR_RESPONSE
89
+ * @ETHTOOL_MM_VERIFY_STATUS_SUCCEEDED:
90
+ *    indicates that the Verify State diagram is in the state VERIFIED
91
+ * @ETHTOOL_MM_VERIFY_STATUS_FAILED:
92
+ *    the Verify State diagram is in the state VERIFY_FAIL
93
+ * @ETHTOOL_MM_VERIFY_STATUS_DISABLED:
94
+ *    verification of preemption operation is disabled
95
+ */
96
+enum ethtool_mm_verify_status {
97
+    ETHTOOL_MM_VERIFY_STATUS_UNKNOWN,
98
+    ETHTOOL_MM_VERIFY_STATUS_INITIAL,
99
+    ETHTOOL_MM_VERIFY_STATUS_VERIFYING,
100
+    ETHTOOL_MM_VERIFY_STATUS_SUCCEEDED,
101
+    ETHTOOL_MM_VERIFY_STATUS_FAILED,
102
+    ETHTOOL_MM_VERIFY_STATUS_DISABLED,
103
+};
104
+
105
/**
106
* struct ethtool_gstrings - string set for data tagging
107
* @cmd: Command number = %ETHTOOL_GSTRINGS
108
@@ -XXX,XX +XXX,XX @@ struct ethtool_rxnfc {
109
        uint32_t            rule_cnt;
110
        uint32_t            rss_context;
111
    };
112
-    uint32_t                rule_locs[0];
113
+    uint32_t                rule_locs[];
114
};
115
116
117
@@ -XXX,XX +XXX,XX @@ enum ethtool_link_mode_bit_indices {
118
    ETHTOOL_LINK_MODE_800000baseDR8_2_Full_BIT     = 96,
119
    ETHTOOL_LINK_MODE_800000baseSR8_Full_BIT     = 97,
120
    ETHTOOL_LINK_MODE_800000baseVR8_Full_BIT     = 98,
121
+    ETHTOOL_LINK_MODE_10baseT1S_Full_BIT         = 99,
122
+    ETHTOOL_LINK_MODE_10baseT1S_Half_BIT         = 100,
123
+    ETHTOOL_LINK_MODE_10baseT1S_P2MP_Half_BIT     = 101,
124
125
    /* must be last entry */
126
    __ETHTOOL_LINK_MODE_MASK_NBITS
127
diff --git a/include/standard-headers/linux/fuse.h b/include/standard-headers/linux/fuse.h
128
index XXXXXXX..XXXXXXX 100644
129
--- a/include/standard-headers/linux/fuse.h
130
+++ b/include/standard-headers/linux/fuse.h
131
@@ -XXX,XX +XXX,XX @@
132
* 7.38
133
* - add FUSE_EXPIRE_ONLY flag to fuse_notify_inval_entry
134
* - add FOPEN_PARALLEL_DIRECT_WRITES
135
+ * - add total_extlen to fuse_in_header
136
+ * - add FUSE_MAX_NR_SECCTX
137
+ * - add extension header
138
+ * - add FUSE_EXT_GROUPS
139
+ * - add FUSE_CREATE_SUPP_GROUP
140
*/
141
142
#ifndef _LINUX_FUSE_H
143
@@ -XXX,XX +XXX,XX @@ struct fuse_file_lock {
144
* FUSE_SECURITY_CTX:    add security context to create, mkdir, symlink, and
145
*            mknod
146
* FUSE_HAS_INODE_DAX: use per inode DAX
147
+ * FUSE_CREATE_SUPP_GROUP: add supplementary group info to create, mkdir,
148
+ *            symlink and mknod (single group that matches parent)
149
*/
150
#define FUSE_ASYNC_READ        (1 << 0)
151
#define FUSE_POSIX_LOCKS    (1 << 1)
152
@@ -XXX,XX +XXX,XX @@ struct fuse_file_lock {
153
/* bits 32..63 get shifted down 32 bits into the flags2 field */
154
#define FUSE_SECURITY_CTX    (1ULL << 32)
155
#define FUSE_HAS_INODE_DAX    (1ULL << 33)
156
+#define FUSE_CREATE_SUPP_GROUP    (1ULL << 34)
157
158
/**
159
* CUSE INIT request/reply flags
160
@@ -XXX,XX +XXX,XX @@ struct fuse_file_lock {
161
*/
162
#define FUSE_EXPIRE_ONLY        (1 << 0)
163
164
+/**
165
+ * extension type
166
+ * FUSE_MAX_NR_SECCTX: maximum value of &fuse_secctx_header.nr_secctx
167
+ * FUSE_EXT_GROUPS: &fuse_supp_groups extension
168
+ */
169
+enum fuse_ext_type {
170
+    /* Types 0..31 are reserved for fuse_secctx_header */
171
+    FUSE_MAX_NR_SECCTX    = 31,
172
+    FUSE_EXT_GROUPS        = 32,
173
+};
174
+
175
enum fuse_opcode {
176
    FUSE_LOOKUP        = 1,
177
    FUSE_FORGET        = 2, /* no reply */
178
@@ -XXX,XX +XXX,XX @@ struct fuse_in_header {
179
    uint32_t    uid;
180
    uint32_t    gid;
181
    uint32_t    pid;
182
-    uint32_t    padding;
183
+    uint16_t    total_extlen; /* length of extensions in 8byte units */
184
+    uint16_t    padding;
185
};
186
187
struct fuse_out_header {
188
@@ -XXX,XX +XXX,XX @@ struct fuse_secctx_header {
189
    uint32_t    nr_secctx;
190
};
191
192
+/**
193
+ * struct fuse_ext_header - extension header
194
+ * @size: total size of this extension including this header
195
+ * @type: type of extension
196
+ *
197
+ * This is made compatible with fuse_secctx_header by using type values >
198
+ * FUSE_MAX_NR_SECCTX
199
+ */
200
+struct fuse_ext_header {
201
+    uint32_t    size;
202
+    uint32_t    type;
203
+};
204
+
205
+/**
206
+ * struct fuse_supp_groups - Supplementary group extension
207
+ * @nr_groups: number of supplementary groups
208
+ * @groups: flexible array of group IDs
209
+ */
210
+struct fuse_supp_groups {
211
+    uint32_t    nr_groups;
212
+    uint32_t    groups[];
213
+};
214
+
215
#endif /* _LINUX_FUSE_H */
216
diff --git a/include/standard-headers/linux/pci_regs.h b/include/standard-headers/linux/pci_regs.h
217
index XXXXXXX..XXXXXXX 100644
218
--- a/include/standard-headers/linux/pci_regs.h
219
+++ b/include/standard-headers/linux/pci_regs.h
220
@@ -XXX,XX +XXX,XX @@
221
#define PCI_EXP_LNKCTL2_TX_MARGIN    0x0380 /* Transmit Margin */
222
#define PCI_EXP_LNKCTL2_HASD        0x0020 /* HW Autonomous Speed Disable */
223
#define PCI_EXP_LNKSTA2        0x32    /* Link Status 2 */
224
+#define PCI_EXP_LNKSTA2_FLIT        0x0400 /* Flit Mode Status */
225
#define PCI_CAP_EXP_ENDPOINT_SIZEOF_V2    0x32    /* end of v2 EPs w/ link */
226
#define PCI_EXP_SLTCAP2        0x34    /* Slot Capabilities 2 */
227
#define PCI_EXP_SLTCAP2_IBPD    0x00000001 /* In-band PD Disable Supported */
228
diff --git a/include/standard-headers/linux/vhost_types.h b/include/standard-headers/linux/vhost_types.h
229
index XXXXXXX..XXXXXXX 100644
230
--- a/include/standard-headers/linux/vhost_types.h
231
+++ b/include/standard-headers/linux/vhost_types.h
232
@@ -XXX,XX +XXX,XX @@ struct vhost_vdpa_iova_range {
233
#define VHOST_BACKEND_F_IOTLB_ASID 0x3
234
/* Device can be suspended */
235
#define VHOST_BACKEND_F_SUSPEND 0x4
236
+/* Device can be resumed */
237
+#define VHOST_BACKEND_F_RESUME 0x5
238
239
#endif
240
diff --git a/include/standard-headers/linux/virtio_blk.h b/include/standard-headers/linux/virtio_blk.h
241
index XXXXXXX..XXXXXXX 100644
242
--- a/include/standard-headers/linux/virtio_blk.h
243
+++ b/include/standard-headers/linux/virtio_blk.h
244
@@ -XXX,XX +XXX,XX @@
245
#define VIRTIO_BLK_F_DISCARD    13    /* DISCARD is supported */
246
#define VIRTIO_BLK_F_WRITE_ZEROES    14    /* WRITE ZEROES is supported */
247
#define VIRTIO_BLK_F_SECURE_ERASE    16 /* Secure Erase is supported */
248
+#define VIRTIO_BLK_F_ZONED        17    /* Zoned block device */
249
250
/* Legacy feature bits */
251
#ifndef VIRTIO_BLK_NO_LEGACY
252
@@ -XXX,XX +XXX,XX @@ struct virtio_blk_config {
253
    /* Secure erase commands must be aligned to this number of sectors. */
254
    __virtio32 secure_erase_sector_alignment;
255
256
+    /* Zoned block device characteristics (if VIRTIO_BLK_F_ZONED) */
257
+    struct virtio_blk_zoned_characteristics {
258
+        uint32_t zone_sectors;
259
+        uint32_t max_open_zones;
260
+        uint32_t max_active_zones;
261
+        uint32_t max_append_sectors;
262
+        uint32_t write_granularity;
263
+        uint8_t model;
264
+        uint8_t unused2[3];
265
+    } zoned;
266
} QEMU_PACKED;
267
268
/*
269
@@ -XXX,XX +XXX,XX @@ struct virtio_blk_config {
270
/* Secure erase command */
271
#define VIRTIO_BLK_T_SECURE_ERASE    14
272
273
+/* Zone append command */
274
+#define VIRTIO_BLK_T_ZONE_APPEND 15
275
+
276
+/* Report zones command */
277
+#define VIRTIO_BLK_T_ZONE_REPORT 16
278
+
279
+/* Open zone command */
280
+#define VIRTIO_BLK_T_ZONE_OPEN 18
281
+
282
+/* Close zone command */
283
+#define VIRTIO_BLK_T_ZONE_CLOSE 20
284
+
285
+/* Finish zone command */
286
+#define VIRTIO_BLK_T_ZONE_FINISH 22
287
+
288
+/* Reset zone command */
289
+#define VIRTIO_BLK_T_ZONE_RESET 24
290
+
291
+/* Reset All zones command */
292
+#define VIRTIO_BLK_T_ZONE_RESET_ALL 26
293
+
294
#ifndef VIRTIO_BLK_NO_LEGACY
295
/* Barrier before this op. */
296
#define VIRTIO_BLK_T_BARRIER    0x80000000
297
@@ -XXX,XX +XXX,XX @@ struct virtio_blk_outhdr {
298
    __virtio64 sector;
299
};
300
301
+/*
302
+ * Supported zoned device models.
303
+ */
304
+
305
+/* Regular block device */
306
+#define VIRTIO_BLK_Z_NONE 0
307
+/* Host-managed zoned device */
308
+#define VIRTIO_BLK_Z_HM 1
309
+/* Host-aware zoned device */
310
+#define VIRTIO_BLK_Z_HA 2
311
+
312
+/*
313
+ * Zone descriptor. A part of VIRTIO_BLK_T_ZONE_REPORT command reply.
314
+ */
315
+struct virtio_blk_zone_descriptor {
316
+    /* Zone capacity */
317
+    uint64_t z_cap;
318
+    /* The starting sector of the zone */
319
+    uint64_t z_start;
320
+    /* Zone write pointer position in sectors */
321
+    uint64_t z_wp;
322
+    /* Zone type */
323
+    uint8_t z_type;
324
+    /* Zone state */
325
+    uint8_t z_state;
326
+    uint8_t reserved[38];
327
+};
328
+
329
+struct virtio_blk_zone_report {
330
+    uint64_t nr_zones;
331
+    uint8_t reserved[56];
332
+    struct virtio_blk_zone_descriptor zones[];
333
+};
334
+
335
+/*
336
+ * Supported zone types.
337
+ */
338
+
339
+/* Conventional zone */
340
+#define VIRTIO_BLK_ZT_CONV 1
341
+/* Sequential Write Required zone */
342
+#define VIRTIO_BLK_ZT_SWR 2
343
+/* Sequential Write Preferred zone */
344
+#define VIRTIO_BLK_ZT_SWP 3
345
+
346
+/*
347
+ * Zone states that are available for zones of all types.
348
+ */
349
+
350
+/* Not a write pointer (conventional zones only) */
351
+#define VIRTIO_BLK_ZS_NOT_WP 0
352
+/* Empty */
353
+#define VIRTIO_BLK_ZS_EMPTY 1
354
+/* Implicitly Open */
355
+#define VIRTIO_BLK_ZS_IOPEN 2
356
+/* Explicitly Open */
357
+#define VIRTIO_BLK_ZS_EOPEN 3
358
+/* Closed */
359
+#define VIRTIO_BLK_ZS_CLOSED 4
360
+/* Read-Only */
361
+#define VIRTIO_BLK_ZS_RDONLY 13
362
+/* Full */
363
+#define VIRTIO_BLK_ZS_FULL 14
364
+/* Offline */
365
+#define VIRTIO_BLK_ZS_OFFLINE 15
366
+
367
/* Unmap this range (only valid for write zeroes command) */
368
#define VIRTIO_BLK_WRITE_ZEROES_FLAG_UNMAP    0x00000001
369
370
@@ -XXX,XX +XXX,XX @@ struct virtio_scsi_inhdr {
371
#define VIRTIO_BLK_S_OK        0
372
#define VIRTIO_BLK_S_IOERR    1
373
#define VIRTIO_BLK_S_UNSUPP    2
374
+
375
+/* Error codes that are specific to zoned block devices */
376
+#define VIRTIO_BLK_S_ZONE_INVALID_CMD 3
377
+#define VIRTIO_BLK_S_ZONE_UNALIGNED_WP 4
378
+#define VIRTIO_BLK_S_ZONE_OPEN_RESOURCE 5
379
+#define VIRTIO_BLK_S_ZONE_ACTIVE_RESOURCE 6
380
+
381
#endif /* _LINUX_VIRTIO_BLK_H */
382
diff --git a/linux-headers/asm-arm64/kvm.h b/linux-headers/asm-arm64/kvm.h
383
index XXXXXXX..XXXXXXX 100644
384
--- a/linux-headers/asm-arm64/kvm.h
385
+++ b/linux-headers/asm-arm64/kvm.h
386
@@ -XXX,XX +XXX,XX @@ struct kvm_regs {
387
#define KVM_ARM_VCPU_SVE        4 /* enable SVE for this CPU */
388
#define KVM_ARM_VCPU_PTRAUTH_ADDRESS    5 /* VCPU uses address authentication */
389
#define KVM_ARM_VCPU_PTRAUTH_GENERIC    6 /* VCPU uses generic authentication */
390
+#define KVM_ARM_VCPU_HAS_EL2        7 /* Support nested virtualization */
391
392
struct kvm_vcpu_init {
393
    __u32 target;
394
diff --git a/linux-headers/asm-x86/kvm.h b/linux-headers/asm-x86/kvm.h
395
index XXXXXXX..XXXXXXX 100644
396
--- a/linux-headers/asm-x86/kvm.h
397
+++ b/linux-headers/asm-x86/kvm.h
398
@@ -XXX,XX +XXX,XX @@
399
400
#include <linux/types.h>
401
#include <linux/ioctl.h>
402
+#include <linux/stddef.h>
403
404
#define KVM_PIO_PAGE_OFFSET 1
405
#define KVM_COALESCED_MMIO_PAGE_OFFSET 2
406
@@ -XXX,XX +XXX,XX @@ struct kvm_nested_state {
407
     * KVM_{GET,PUT}_NESTED_STATE ioctl values.
408
     */
409
    union {
410
-        struct kvm_vmx_nested_state_data vmx[0];
411
-        struct kvm_svm_nested_state_data svm[0];
412
+        __DECLARE_FLEX_ARRAY(struct kvm_vmx_nested_state_data, vmx);
413
+        __DECLARE_FLEX_ARRAY(struct kvm_svm_nested_state_data, svm);
414
    } data;
415
};
416
417
@@ -XXX,XX +XXX,XX @@ struct kvm_pmu_event_filter {
418
#define KVM_PMU_EVENT_ALLOW 0
419
#define KVM_PMU_EVENT_DENY 1
420
421
+#define KVM_PMU_EVENT_FLAG_MASKED_EVENTS BIT(0)
422
+#define KVM_PMU_EVENT_FLAGS_VALID_MASK (KVM_PMU_EVENT_FLAG_MASKED_EVENTS)
423
+
424
+/*
425
+ * Masked event layout.
426
+ * Bits Description
427
+ * ---- -----------
428
+ * 7:0 event select (low bits)
429
+ * 15:8 umask match
430
+ * 31:16 unused
431
+ * 35:32 event select (high bits)
432
+ * 36:54 unused
433
+ * 55 exclude bit
434
+ * 63:56 umask mask
435
+ */
436
+
437
+#define KVM_PMU_ENCODE_MASKED_ENTRY(event_select, mask, match, exclude) \
438
+    (((event_select) & 0xFFULL) | (((event_select) & 0XF00ULL) << 24) | \
439
+    (((mask) & 0xFFULL) << 56) | \
440
+    (((match) & 0xFFULL) << 8) | \
441
+    ((__u64)(!!(exclude)) << 55))
442
+
443
+#define KVM_PMU_MASKED_ENTRY_EVENT_SELECT \
444
+    (GENMASK_ULL(7, 0) | GENMASK_ULL(35, 32))
445
+#define KVM_PMU_MASKED_ENTRY_UMASK_MASK        (GENMASK_ULL(63, 56))
446
+#define KVM_PMU_MASKED_ENTRY_UMASK_MATCH    (GENMASK_ULL(15, 8))
447
+#define KVM_PMU_MASKED_ENTRY_EXCLUDE        (BIT_ULL(55))
448
+#define KVM_PMU_MASKED_ENTRY_UMASK_MASK_SHIFT    (56)
449
+
450
/* for KVM_{GET,SET,HAS}_DEVICE_ATTR */
451
#define KVM_VCPU_TSC_CTRL 0 /* control group for the timestamp counter (TSC) */
452
#define KVM_VCPU_TSC_OFFSET 0 /* attribute for the TSC offset */
453
diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h
454
index XXXXXXX..XXXXXXX 100644
455
--- a/linux-headers/linux/kvm.h
456
+++ b/linux-headers/linux/kvm.h
457
@@ -XXX,XX +XXX,XX @@ struct kvm_s390_mem_op {
458
        struct {
459
            __u8 ar;    /* the access register number */
460
            __u8 key;    /* access key, ignored if flag unset */
461
+            __u8 pad1[6];    /* ignored */
462
+            __u64 old_addr;    /* ignored if cmpxchg flag unset */
463
        };
464
        __u32 sida_offset; /* offset into the sida */
465
        __u8 reserved[32]; /* ignored */
466
@@ -XXX,XX +XXX,XX @@ struct kvm_s390_mem_op {
467
#define KVM_S390_MEMOP_SIDA_WRITE    3
468
#define KVM_S390_MEMOP_ABSOLUTE_READ    4
469
#define KVM_S390_MEMOP_ABSOLUTE_WRITE    5
470
+#define KVM_S390_MEMOP_ABSOLUTE_CMPXCHG    6
471
+
472
/* flags for kvm_s390_mem_op->flags */
473
#define KVM_S390_MEMOP_F_CHECK_ONLY        (1ULL << 0)
474
#define KVM_S390_MEMOP_F_INJECT_EXCEPTION    (1ULL << 1)
475
#define KVM_S390_MEMOP_F_SKEY_PROTECTION    (1ULL << 2)
476
477
+/* flags specifying extension support via KVM_CAP_S390_MEM_OP_EXTENSION */
478
+#define KVM_S390_MEMOP_EXTENSION_CAP_BASE    (1 << 0)
479
+#define KVM_S390_MEMOP_EXTENSION_CAP_CMPXCHG    (1 << 1)
480
+
481
/* for KVM_INTERRUPT */
482
struct kvm_interrupt {
483
    /* in */
484
@@ -XXX,XX +XXX,XX @@ struct kvm_ppc_resize_hpt {
485
#define KVM_CAP_DIRTY_LOG_RING_ACQ_REL 223
486
#define KVM_CAP_S390_PROTECTED_ASYNC_DISABLE 224
487
#define KVM_CAP_DIRTY_LOG_RING_WITH_BITMAP 225
488
+#define KVM_CAP_PMU_EVENT_MASKED_EVENTS 226
489
490
#ifdef KVM_CAP_IRQ_ROUTING
491
492
diff --git a/linux-headers/linux/vfio.h b/linux-headers/linux/vfio.h
493
index XXXXXXX..XXXXXXX 100644
494
--- a/linux-headers/linux/vfio.h
495
+++ b/linux-headers/linux/vfio.h
496
@@ -XXX,XX +XXX,XX @@
497
/* Supports VFIO_DMA_UNMAP_FLAG_ALL */
498
#define VFIO_UNMAP_ALL            9
499
500
-/* Supports the vaddr flag for DMA map and unmap */
501
+/*
502
+ * Supports the vaddr flag for DMA map and unmap. Not supported for mediated
503
+ * devices, so this capability is subject to change as groups are added or
504
+ * removed.
505
+ */
506
#define VFIO_UPDATE_VADDR        10
507
508
/*
509
@@ -XXX,XX +XXX,XX @@ struct vfio_iommu_type1_info_dma_avail {
510
* Map process virtual addresses to IO virtual addresses using the
511
* provided struct vfio_dma_map. Caller sets argsz. READ &/ WRITE required.
512
*
513
- * If flags & VFIO_DMA_MAP_FLAG_VADDR, update the base vaddr for iova, and
514
- * unblock translation of host virtual addresses in the iova range. The vaddr
515
+ * If flags & VFIO_DMA_MAP_FLAG_VADDR, update the base vaddr for iova. The vaddr
516
* must have previously been invalidated with VFIO_DMA_UNMAP_FLAG_VADDR. To
517
* maintain memory consistency within the user application, the updated vaddr
518
* must address the same memory object as originally mapped. Failure to do so
519
@@ -XXX,XX +XXX,XX @@ struct vfio_bitmap {
520
* must be 0. This cannot be combined with the get-dirty-bitmap flag.
521
*
522
* If flags & VFIO_DMA_UNMAP_FLAG_VADDR, do not unmap, but invalidate host
523
- * virtual addresses in the iova range. Tasks that attempt to translate an
524
- * iova's vaddr will block. DMA to already-mapped pages continues. This
525
- * cannot be combined with the get-dirty-bitmap flag.
526
+ * virtual addresses in the iova range. DMA to already-mapped pages continues.
527
+ * Groups may not be added to the container while any addresses are invalid.
528
+ * This cannot be combined with the get-dirty-bitmap flag.
529
*/
530
struct vfio_iommu_type1_dma_unmap {
531
    __u32    argsz;
532
diff --git a/linux-headers/linux/vhost.h b/linux-headers/linux/vhost.h
533
index XXXXXXX..XXXXXXX 100644
534
--- a/linux-headers/linux/vhost.h
535
+++ b/linux-headers/linux/vhost.h
536
@@ -XXX,XX +XXX,XX @@
537
*/
538
#define VHOST_VDPA_SUSPEND        _IO(VHOST_VIRTIO, 0x7D)
539
540
+/* Resume a device so it can resume processing virtqueue requests
541
+ *
542
+ * After the return of this ioctl the device will have restored all the
543
+ * necessary states and it is fully operational to continue processing the
544
+ * virtqueue descriptors.
545
+ */
546
+#define VHOST_VDPA_RESUME        _IO(VHOST_VIRTIO, 0x7E)
547
+
548
#endif
549
--
550
2.39.2
diff view generated by jsdifflib
1
From: Liu Yiding <liuyd.fnst@fujitsu.com>
1
From: Sam Li <faithilikerun@gmail.com>
2
2
3
virtiofsd has introduced killpriv_v2/no_killpriv_v2 for a while. Add
3
This patch extends virtio-blk emulation to handle zoned device commands
4
description of it to docs/helper.
4
by calling the new block layer APIs to perform zoned device I/O on
5
behalf of the guest. It supports Report Zone, four zone operations (open,
6
close, finish, reset), and Append Zone.
5
7
6
Signed-off-by: Liu Yiding <liuyd.fnst@fujitsu.com>
8
The VIRTIO_BLK_F_ZONED feature bit will only be set if the host does
7
Message-Id: <20220421095151.2231099-1-liuyd.fnst@fujitsu.com>
9
support zoned block devices. For regular block devices (conventional zones),
10
the bit will not be set.
8
11
9
[Small documentation fixes: s/as client supports/as the client supports/
12
The guest OS can use blktests and fio to test these commands on zoned devices.
10
and s/. /. /.
13
Furthermore, using zonefs to test zone append write is also supported.
11
--Stefan]
12
14
15
Signed-off-by: Sam Li <faithilikerun@gmail.com>
16
Message-id: 20230407082528.18841-3-faithilikerun@gmail.com
13
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
17
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
14
---
18
---
15
docs/tools/virtiofsd.rst | 5 +++++
19
hw/block/virtio-blk-common.c | 2 +
16
tools/virtiofsd/helper.c | 3 +++
20
hw/block/virtio-blk.c | 389 +++++++++++++++++++++++++++++++++++
17
2 files changed, 8 insertions(+)
21
hw/virtio/virtio-qmp.c | 2 +
22
3 files changed, 393 insertions(+)
18
23
19
diff --git a/docs/tools/virtiofsd.rst b/docs/tools/virtiofsd.rst
24
diff --git a/hw/block/virtio-blk-common.c b/hw/block/virtio-blk-common.c
20
index XXXXXXX..XXXXXXX 100644
25
index XXXXXXX..XXXXXXX 100644
21
--- a/docs/tools/virtiofsd.rst
26
--- a/hw/block/virtio-blk-common.c
22
+++ b/docs/tools/virtiofsd.rst
27
+++ b/hw/block/virtio-blk-common.c
23
@@ -XXX,XX +XXX,XX @@ Options
28
@@ -XXX,XX +XXX,XX @@ static const VirtIOFeature feature_sizes[] = {
24
label. Server will try to set that label on newly created file
29
.end = endof(struct virtio_blk_config, discard_sector_alignment)},
25
atomically wherever possible.
30
{.flags = 1ULL << VIRTIO_BLK_F_WRITE_ZEROES,
26
31
.end = endof(struct virtio_blk_config, write_zeroes_may_unmap)},
27
+ * killpriv_v2|no_killpriv_v2 -
32
+ {.flags = 1ULL << VIRTIO_BLK_F_ZONED,
28
+ Enable/disable ``FUSE_HANDLE_KILLPRIV_V2`` support. KILLPRIV_V2 is enabled
33
+ .end = endof(struct virtio_blk_config, zoned)},
29
+ by default as long as the client supports it. Enabling this option helps
34
{}
30
+ with performance in write path.
35
};
31
+
36
32
.. option:: --socket-path=PATH
37
diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
33
34
Listen on vhost-user UNIX domain socket at PATH.
35
diff --git a/tools/virtiofsd/helper.c b/tools/virtiofsd/helper.c
36
index XXXXXXX..XXXXXXX 100644
38
index XXXXXXX..XXXXXXX 100644
37
--- a/tools/virtiofsd/helper.c
39
--- a/hw/block/virtio-blk.c
38
+++ b/tools/virtiofsd/helper.c
40
+++ b/hw/block/virtio-blk.c
39
@@ -XXX,XX +XXX,XX @@ void fuse_cmdline_help(void)
41
@@ -XXX,XX +XXX,XX @@
40
" -o announce_submounts Announce sub-mount points to the guest\n"
42
#include "qemu/module.h"
41
" -o posix_acl/no_posix_acl Enable/Disable posix_acl. (default: disabled)\n"
43
#include "qemu/error-report.h"
42
" -o security_label/no_security_label Enable/Disable security label. (default: disabled)\n"
44
#include "qemu/main-loop.h"
43
+ " -o killpriv_v2/no_killpriv_v2\n"
45
+#include "block/block_int.h"
44
+ " Enable/Disable FUSE_HANDLE_KILLPRIV_V2.\n"
46
#include "trace.h"
45
+ " (default: enabled as long as client supports it)\n"
47
#include "hw/block/block.h"
46
);
48
#include "hw/qdev-properties.h"
49
@@ -XXX,XX +XXX,XX @@ err:
50
return err_status;
47
}
51
}
48
52
53
+typedef struct ZoneCmdData {
54
+ VirtIOBlockReq *req;
55
+ struct iovec *in_iov;
56
+ unsigned in_num;
57
+ union {
58
+ struct {
59
+ unsigned int nr_zones;
60
+ BlockZoneDescriptor *zones;
61
+ } zone_report_data;
62
+ struct {
63
+ int64_t offset;
64
+ } zone_append_data;
65
+ };
66
+} ZoneCmdData;
67
+
68
+/*
69
+ * check_zoned_request: error checking before issuing requests. Returns true
70
+ * if all checks pass.
71
+ * append: true if the request is a zone append.
72
+ */
73
+static bool check_zoned_request(VirtIOBlock *s, int64_t offset, int64_t len,
74
+ bool append, uint8_t *status) {
75
+ BlockDriverState *bs = blk_bs(s->blk);
76
+ int index;
77
+
78
+ if (!virtio_has_feature(s->host_features, VIRTIO_BLK_F_ZONED)) {
79
+ *status = VIRTIO_BLK_S_UNSUPP;
80
+ return false;
81
+ }
82
+
83
+ if (offset < 0 || len < 0 || len > (bs->total_sectors << BDRV_SECTOR_BITS)
84
+ || offset > (bs->total_sectors << BDRV_SECTOR_BITS) - len) {
85
+ *status = VIRTIO_BLK_S_ZONE_INVALID_CMD;
86
+ return false;
87
+ }
88
+
89
+ if (append) {
90
+ if (bs->bl.write_granularity) {
91
+ if ((offset % bs->bl.write_granularity) != 0) {
92
+ *status = VIRTIO_BLK_S_ZONE_UNALIGNED_WP;
93
+ return false;
94
+ }
95
+ }
96
+
97
+ index = offset / bs->bl.zone_size;
98
+ if (BDRV_ZT_IS_CONV(bs->wps->wp[index])) {
99
+ *status = VIRTIO_BLK_S_ZONE_INVALID_CMD;
100
+ return false;
101
+ }
102
+
103
+ if (len / 512 > bs->bl.max_append_sectors) {
104
+ if (bs->bl.max_append_sectors == 0) {
105
+ *status = VIRTIO_BLK_S_UNSUPP;
106
+ } else {
107
+ *status = VIRTIO_BLK_S_ZONE_INVALID_CMD;
108
+ }
109
+ return false;
110
+ }
111
+ }
112
+ return true;
113
+}
114
+
115
+static void virtio_blk_zone_report_complete(void *opaque, int ret)
116
+{
117
+ ZoneCmdData *data = opaque;
118
+ VirtIOBlockReq *req = data->req;
119
+ VirtIOBlock *s = req->dev;
120
+ VirtIODevice *vdev = VIRTIO_DEVICE(req->dev);
121
+ struct iovec *in_iov = data->in_iov;
122
+ unsigned in_num = data->in_num;
123
+ int64_t zrp_size, n, j = 0;
124
+ int64_t nz = data->zone_report_data.nr_zones;
125
+ int8_t err_status = VIRTIO_BLK_S_OK;
126
+
127
+ if (ret) {
128
+ err_status = VIRTIO_BLK_S_ZONE_INVALID_CMD;
129
+ goto out;
130
+ }
131
+
132
+ struct virtio_blk_zone_report zrp_hdr = (struct virtio_blk_zone_report) {
133
+ .nr_zones = cpu_to_le64(nz),
134
+ };
135
+ zrp_size = sizeof(struct virtio_blk_zone_report)
136
+ + sizeof(struct virtio_blk_zone_descriptor) * nz;
137
+ n = iov_from_buf(in_iov, in_num, 0, &zrp_hdr, sizeof(zrp_hdr));
138
+ if (n != sizeof(zrp_hdr)) {
139
+ virtio_error(vdev, "Driver provided input buffer that is too small!");
140
+ err_status = VIRTIO_BLK_S_ZONE_INVALID_CMD;
141
+ goto out;
142
+ }
143
+
144
+ for (size_t i = sizeof(zrp_hdr); i < zrp_size;
145
+ i += sizeof(struct virtio_blk_zone_descriptor), ++j) {
146
+ struct virtio_blk_zone_descriptor desc =
147
+ (struct virtio_blk_zone_descriptor) {
148
+ .z_start = cpu_to_le64(data->zone_report_data.zones[j].start
149
+ >> BDRV_SECTOR_BITS),
150
+ .z_cap = cpu_to_le64(data->zone_report_data.zones[j].cap
151
+ >> BDRV_SECTOR_BITS),
152
+ .z_wp = cpu_to_le64(data->zone_report_data.zones[j].wp
153
+ >> BDRV_SECTOR_BITS),
154
+ };
155
+
156
+ switch (data->zone_report_data.zones[j].type) {
157
+ case BLK_ZT_CONV:
158
+ desc.z_type = VIRTIO_BLK_ZT_CONV;
159
+ break;
160
+ case BLK_ZT_SWR:
161
+ desc.z_type = VIRTIO_BLK_ZT_SWR;
162
+ break;
163
+ case BLK_ZT_SWP:
164
+ desc.z_type = VIRTIO_BLK_ZT_SWP;
165
+ break;
166
+ default:
167
+ g_assert_not_reached();
168
+ }
169
+
170
+ switch (data->zone_report_data.zones[j].state) {
171
+ case BLK_ZS_RDONLY:
172
+ desc.z_state = VIRTIO_BLK_ZS_RDONLY;
173
+ break;
174
+ case BLK_ZS_OFFLINE:
175
+ desc.z_state = VIRTIO_BLK_ZS_OFFLINE;
176
+ break;
177
+ case BLK_ZS_EMPTY:
178
+ desc.z_state = VIRTIO_BLK_ZS_EMPTY;
179
+ break;
180
+ case BLK_ZS_CLOSED:
181
+ desc.z_state = VIRTIO_BLK_ZS_CLOSED;
182
+ break;
183
+ case BLK_ZS_FULL:
184
+ desc.z_state = VIRTIO_BLK_ZS_FULL;
185
+ break;
186
+ case BLK_ZS_EOPEN:
187
+ desc.z_state = VIRTIO_BLK_ZS_EOPEN;
188
+ break;
189
+ case BLK_ZS_IOPEN:
190
+ desc.z_state = VIRTIO_BLK_ZS_IOPEN;
191
+ break;
192
+ case BLK_ZS_NOT_WP:
193
+ desc.z_state = VIRTIO_BLK_ZS_NOT_WP;
194
+ break;
195
+ default:
196
+ g_assert_not_reached();
197
+ }
198
+
199
+ /* TODO: this loop has O(n^2) time complexity and should be optimized. */
200
+ n = iov_from_buf(in_iov, in_num, i, &desc, sizeof(desc));
201
+ if (n != sizeof(desc)) {
202
+ virtio_error(vdev, "Driver provided input buffer "
203
+ "for descriptors that is too small!");
204
+ err_status = VIRTIO_BLK_S_ZONE_INVALID_CMD;
205
+ }
206
+ }
207
+
208
+out:
209
+ aio_context_acquire(blk_get_aio_context(s->conf.conf.blk));
210
+ virtio_blk_req_complete(req, err_status);
211
+ virtio_blk_free_request(req);
212
+ aio_context_release(blk_get_aio_context(s->conf.conf.blk));
213
+ g_free(data->zone_report_data.zones);
214
+ g_free(data);
215
+}
216
+
217
+static void virtio_blk_handle_zone_report(VirtIOBlockReq *req,
218
+ struct iovec *in_iov,
219
+ unsigned in_num)
220
+{
221
+ VirtIOBlock *s = req->dev;
222
+ VirtIODevice *vdev = VIRTIO_DEVICE(s);
223
+ unsigned int nr_zones;
224
+ ZoneCmdData *data;
225
+ int64_t zone_size, offset;
226
+ uint8_t err_status;
227
+
228
+ if (req->in_len < sizeof(struct virtio_blk_inhdr) +
229
+ sizeof(struct virtio_blk_zone_report) +
230
+ sizeof(struct virtio_blk_zone_descriptor)) {
231
+ virtio_error(vdev, "in buffer too small for zone report");
232
+ return;
233
+ }
234
+
235
+ /* start byte offset of the zone report */
236
+ offset = virtio_ldq_p(vdev, &req->out.sector) << BDRV_SECTOR_BITS;
237
+ if (!check_zoned_request(s, offset, 0, false, &err_status)) {
238
+ goto out;
239
+ }
240
+ nr_zones = (req->in_len - sizeof(struct virtio_blk_inhdr) -
241
+ sizeof(struct virtio_blk_zone_report)) /
242
+ sizeof(struct virtio_blk_zone_descriptor);
243
+
244
+ zone_size = sizeof(BlockZoneDescriptor) * nr_zones;
245
+ data = g_malloc(sizeof(ZoneCmdData));
246
+ data->req = req;
247
+ data->in_iov = in_iov;
248
+ data->in_num = in_num;
249
+ data->zone_report_data.nr_zones = nr_zones;
250
+ data->zone_report_data.zones = g_malloc(zone_size);
251
+
252
+ blk_aio_zone_report(s->blk, offset, &data->zone_report_data.nr_zones,
253
+ data->zone_report_data.zones,
254
+ virtio_blk_zone_report_complete, data);
255
+ return;
256
+out:
257
+ virtio_blk_req_complete(req, err_status);
258
+ virtio_blk_free_request(req);
259
+}
260
+
261
+static void virtio_blk_zone_mgmt_complete(void *opaque, int ret)
262
+{
263
+ VirtIOBlockReq *req = opaque;
264
+ VirtIOBlock *s = req->dev;
265
+ int8_t err_status = VIRTIO_BLK_S_OK;
266
+
267
+ if (ret) {
268
+ err_status = VIRTIO_BLK_S_ZONE_INVALID_CMD;
269
+ }
270
+
271
+ aio_context_acquire(blk_get_aio_context(s->conf.conf.blk));
272
+ virtio_blk_req_complete(req, err_status);
273
+ virtio_blk_free_request(req);
274
+ aio_context_release(blk_get_aio_context(s->conf.conf.blk));
275
+}
276
+
277
+static int virtio_blk_handle_zone_mgmt(VirtIOBlockReq *req, BlockZoneOp op)
278
+{
279
+ VirtIOBlock *s = req->dev;
280
+ VirtIODevice *vdev = VIRTIO_DEVICE(s);
281
+ BlockDriverState *bs = blk_bs(s->blk);
282
+ int64_t offset = virtio_ldq_p(vdev, &req->out.sector) << BDRV_SECTOR_BITS;
283
+ uint64_t len;
284
+ uint64_t capacity = bs->total_sectors << BDRV_SECTOR_BITS;
285
+ uint8_t err_status = VIRTIO_BLK_S_OK;
286
+
287
+ uint32_t type = virtio_ldl_p(vdev, &req->out.type);
288
+ if (type == VIRTIO_BLK_T_ZONE_RESET_ALL) {
289
+ /* Entire drive capacity */
290
+ offset = 0;
291
+ len = capacity;
292
+ } else {
293
+ if (bs->bl.zone_size > capacity - offset) {
294
+ /* The zoned device allows the last smaller zone. */
295
+ len = capacity - bs->bl.zone_size * (bs->bl.nr_zones - 1);
296
+ } else {
297
+ len = bs->bl.zone_size;
298
+ }
299
+ }
300
+
301
+ if (!check_zoned_request(s, offset, len, false, &err_status)) {
302
+ goto out;
303
+ }
304
+
305
+ blk_aio_zone_mgmt(s->blk, op, offset, len,
306
+ virtio_blk_zone_mgmt_complete, req);
307
+
308
+ return 0;
309
+out:
310
+ virtio_blk_req_complete(req, err_status);
311
+ virtio_blk_free_request(req);
312
+ return err_status;
313
+}
314
+
315
+static void virtio_blk_zone_append_complete(void *opaque, int ret)
316
+{
317
+ ZoneCmdData *data = opaque;
318
+ VirtIOBlockReq *req = data->req;
319
+ VirtIOBlock *s = req->dev;
320
+ VirtIODevice *vdev = VIRTIO_DEVICE(req->dev);
321
+ int64_t append_sector, n;
322
+ uint8_t err_status = VIRTIO_BLK_S_OK;
323
+
324
+ if (ret) {
325
+ err_status = VIRTIO_BLK_S_ZONE_INVALID_CMD;
326
+ goto out;
327
+ }
328
+
329
+ virtio_stq_p(vdev, &append_sector,
330
+ data->zone_append_data.offset >> BDRV_SECTOR_BITS);
331
+ n = iov_from_buf(data->in_iov, data->in_num, 0, &append_sector,
332
+ sizeof(append_sector));
333
+ if (n != sizeof(append_sector)) {
334
+ virtio_error(vdev, "Driver provided input buffer less than size of "
335
+ "append_sector");
336
+ err_status = VIRTIO_BLK_S_ZONE_INVALID_CMD;
337
+ goto out;
338
+ }
339
+
340
+out:
341
+ aio_context_acquire(blk_get_aio_context(s->conf.conf.blk));
342
+ virtio_blk_req_complete(req, err_status);
343
+ virtio_blk_free_request(req);
344
+ aio_context_release(blk_get_aio_context(s->conf.conf.blk));
345
+ g_free(data);
346
+}
347
+
348
+static int virtio_blk_handle_zone_append(VirtIOBlockReq *req,
349
+ struct iovec *out_iov,
350
+ struct iovec *in_iov,
351
+ uint64_t out_num,
352
+ unsigned in_num) {
353
+ VirtIOBlock *s = req->dev;
354
+ VirtIODevice *vdev = VIRTIO_DEVICE(s);
355
+ uint8_t err_status = VIRTIO_BLK_S_OK;
356
+
357
+ int64_t offset = virtio_ldq_p(vdev, &req->out.sector) << BDRV_SECTOR_BITS;
358
+ int64_t len = iov_size(out_iov, out_num);
359
+
360
+ if (!check_zoned_request(s, offset, len, true, &err_status)) {
361
+ goto out;
362
+ }
363
+
364
+ ZoneCmdData *data = g_malloc(sizeof(ZoneCmdData));
365
+ data->req = req;
366
+ data->in_iov = in_iov;
367
+ data->in_num = in_num;
368
+ data->zone_append_data.offset = offset;
369
+ qemu_iovec_init_external(&req->qiov, out_iov, out_num);
370
+ blk_aio_zone_append(s->blk, &data->zone_append_data.offset, &req->qiov, 0,
371
+ virtio_blk_zone_append_complete, data);
372
+ return 0;
373
+
374
+out:
375
+ aio_context_acquire(blk_get_aio_context(s->conf.conf.blk));
376
+ virtio_blk_req_complete(req, err_status);
377
+ virtio_blk_free_request(req);
378
+ aio_context_release(blk_get_aio_context(s->conf.conf.blk));
379
+ return err_status;
380
+}
381
+
382
static int virtio_blk_handle_request(VirtIOBlockReq *req, MultiReqBuffer *mrb)
383
{
384
uint32_t type;
385
@@ -XXX,XX +XXX,XX @@ static int virtio_blk_handle_request(VirtIOBlockReq *req, MultiReqBuffer *mrb)
386
case VIRTIO_BLK_T_FLUSH:
387
virtio_blk_handle_flush(req, mrb);
388
break;
389
+ case VIRTIO_BLK_T_ZONE_REPORT:
390
+ virtio_blk_handle_zone_report(req, in_iov, in_num);
391
+ break;
392
+ case VIRTIO_BLK_T_ZONE_OPEN:
393
+ virtio_blk_handle_zone_mgmt(req, BLK_ZO_OPEN);
394
+ break;
395
+ case VIRTIO_BLK_T_ZONE_CLOSE:
396
+ virtio_blk_handle_zone_mgmt(req, BLK_ZO_CLOSE);
397
+ break;
398
+ case VIRTIO_BLK_T_ZONE_FINISH:
399
+ virtio_blk_handle_zone_mgmt(req, BLK_ZO_FINISH);
400
+ break;
401
+ case VIRTIO_BLK_T_ZONE_RESET:
402
+ virtio_blk_handle_zone_mgmt(req, BLK_ZO_RESET);
403
+ break;
404
+ case VIRTIO_BLK_T_ZONE_RESET_ALL:
405
+ virtio_blk_handle_zone_mgmt(req, BLK_ZO_RESET);
406
+ break;
407
case VIRTIO_BLK_T_SCSI_CMD:
408
virtio_blk_handle_scsi(req);
409
break;
410
@@ -XXX,XX +XXX,XX @@ static int virtio_blk_handle_request(VirtIOBlockReq *req, MultiReqBuffer *mrb)
411
virtio_blk_free_request(req);
412
break;
413
}
414
+ case VIRTIO_BLK_T_ZONE_APPEND & ~VIRTIO_BLK_T_OUT:
415
+ /*
416
+ * Pass out_iov/out_num and in_iov/in_num explicitly; it is not safe
417
+ * to access req->elem.out_sg directly because it may be
418
+ * modified by virtio_blk_handle_request().
419
+ */
420
+ virtio_blk_handle_zone_append(req, out_iov, in_iov, out_num, in_num);
421
+ break;
422
/*
423
* VIRTIO_BLK_T_DISCARD and VIRTIO_BLK_T_WRITE_ZEROES are defined with
424
* VIRTIO_BLK_T_OUT flag set. We masked this flag in the switch statement,
425
@@ -XXX,XX +XXX,XX @@ static void virtio_blk_update_config(VirtIODevice *vdev, uint8_t *config)
426
{
427
VirtIOBlock *s = VIRTIO_BLK(vdev);
428
BlockConf *conf = &s->conf.conf;
429
+ BlockDriverState *bs = blk_bs(s->blk);
430
struct virtio_blk_config blkcfg;
431
uint64_t capacity;
432
int64_t length;
433
@@ -XXX,XX +XXX,XX @@ static void virtio_blk_update_config(VirtIODevice *vdev, uint8_t *config)
434
blkcfg.write_zeroes_may_unmap = 1;
435
virtio_stl_p(vdev, &blkcfg.max_write_zeroes_seg, 1);
436
}
437
+ if (bs->bl.zoned != BLK_Z_NONE) {
438
+ switch (bs->bl.zoned) {
439
+ case BLK_Z_HM:
440
+ blkcfg.zoned.model = VIRTIO_BLK_Z_HM;
441
+ break;
442
+ case BLK_Z_HA:
443
+ blkcfg.zoned.model = VIRTIO_BLK_Z_HA;
444
+ break;
445
+ default:
446
+ g_assert_not_reached();
447
+ }
448
+
449
+ virtio_stl_p(vdev, &blkcfg.zoned.zone_sectors,
450
+ bs->bl.zone_size / 512);
451
+ virtio_stl_p(vdev, &blkcfg.zoned.max_active_zones,
452
+ bs->bl.max_active_zones);
453
+ virtio_stl_p(vdev, &blkcfg.zoned.max_open_zones,
454
+ bs->bl.max_open_zones);
455
+ virtio_stl_p(vdev, &blkcfg.zoned.write_granularity, blk_size);
456
+ virtio_stl_p(vdev, &blkcfg.zoned.max_append_sectors,
457
+ bs->bl.max_append_sectors);
458
+ } else {
459
+ blkcfg.zoned.model = VIRTIO_BLK_Z_NONE;
460
+ }
461
memcpy(config, &blkcfg, s->config_size);
462
}
463
464
@@ -XXX,XX +XXX,XX @@ static void virtio_blk_device_realize(DeviceState *dev, Error **errp)
465
VirtIODevice *vdev = VIRTIO_DEVICE(dev);
466
VirtIOBlock *s = VIRTIO_BLK(dev);
467
VirtIOBlkConf *conf = &s->conf;
468
+ BlockDriverState *bs = blk_bs(conf->conf.blk);
469
Error *err = NULL;
470
unsigned i;
471
472
@@ -XXX,XX +XXX,XX @@ static void virtio_blk_device_realize(DeviceState *dev, Error **errp)
473
return;
474
}
475
476
+ if (bs->bl.zoned != BLK_Z_NONE) {
477
+ virtio_add_feature(&s->host_features, VIRTIO_BLK_F_ZONED);
478
+ if (bs->bl.zoned == BLK_Z_HM) {
479
+ virtio_clear_feature(&s->host_features, VIRTIO_BLK_F_DISCARD);
480
+ }
481
+ }
482
+
483
if (virtio_has_feature(s->host_features, VIRTIO_BLK_F_DISCARD) &&
484
(!conf->max_discard_sectors ||
485
conf->max_discard_sectors > BDRV_REQUEST_MAX_SECTORS)) {
486
diff --git a/hw/virtio/virtio-qmp.c b/hw/virtio/virtio-qmp.c
487
index XXXXXXX..XXXXXXX 100644
488
--- a/hw/virtio/virtio-qmp.c
489
+++ b/hw/virtio/virtio-qmp.c
490
@@ -XXX,XX +XXX,XX @@ static const qmp_virtio_feature_map_t virtio_blk_feature_map[] = {
491
"VIRTIO_BLK_F_DISCARD: Discard command supported"),
492
FEATURE_ENTRY(VIRTIO_BLK_F_WRITE_ZEROES, \
493
"VIRTIO_BLK_F_WRITE_ZEROES: Write zeroes command supported"),
494
+ FEATURE_ENTRY(VIRTIO_BLK_F_ZONED, \
495
+ "VIRTIO_BLK_F_ZONED: Zoned block devices"),
496
#ifndef VIRTIO_BLK_NO_LEGACY
497
FEATURE_ENTRY(VIRTIO_BLK_F_BARRIER, \
498
"VIRTIO_BLK_F_BARRIER: Request barriers supported"),
49
--
499
--
50
2.35.1
500
2.39.2
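
The buffer and length arithmetic used by the zone report and zone management handlers in the patch above can be sketched in isolation. This is only an illustration under stated assumptions: `zone_report_capacity()` and `zone_len_at()` are hypothetical helper names that do not exist in QEMU, the structure sizes are passed in as plain parameters rather than taken from the virtio headers, and the last-zone length is derived from the remaining capacity rather than from `nr_zones`, which should give the same result when `offset` is the start of the last zone.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: how many zone descriptors fit in a driver-supplied
 * input buffer of in_len bytes, after the status header and the report
 * header. Returns 0 if not even one descriptor fits.
 */
static size_t zone_report_capacity(size_t in_len, size_t inhdr_size,
                                   size_t report_hdr_size, size_t desc_size)
{
    if (desc_size == 0 ||
        in_len < inhdr_size + report_hdr_size + desc_size) {
        return 0;
    }
    return (in_len - inhdr_size - report_hdr_size) / desc_size;
}

/*
 * Hypothetical helper: length in bytes of the zone starting at offset.
 * The last zone may be smaller than zone_size. Returns 0 if offset lies
 * at or beyond the device capacity.
 */
static uint64_t zone_len_at(uint64_t capacity, uint64_t zone_size,
                            uint64_t offset)
{
    uint64_t remaining;

    if (offset >= capacity) {
        return 0;
    }
    remaining = capacity - offset;
    return remaining < zone_size ? remaining : zone_size;
}
```

For instance, with a hypothetical 1000-byte device and 256-byte zones, the zone starting at offset 768 would be 232 bytes long, matching `capacity - zone_size * (nr_zones - 1)` with four zones.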
1
From: Sakshi Kaushik <sakshikaushik717@gmail.com>
1
From: Sam Li <faithilikerun@gmail.com>
2
2
3
Signed-off-by: Sakshi Kaushik <sakshikaushik717@gmail.com>
3
To account for the new zone append write operation for zoned devices,
4
Message-id: 20220406162410.8536-1-sakshikaushik717@gmail.com
4
the BLOCK_ACCT_ZONE_APPEND enum is introduced alongside the other I/O request types (read,
5
write, flush).
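
As a rough illustration of the accounting change described in this commit message, the sketch below shows how counters stored in arrays indexed by the accounting enum pick up a new I/O type without further plumbing. The enum entries are the ones visible in the include/block/accounting.h hunk of this patch; `AcctStats` and `acct_done()` are hypothetical names used only for this sketch and are not QEMU's accounting API.

```c
#include <assert.h>
#include <stdint.h>

/* Entries as they appear in the patched hunk of include/block/accounting.h. */
enum BlockAcctType {
    BLOCK_ACCT_READ,
    BLOCK_ACCT_WRITE,
    BLOCK_ACCT_FLUSH,
    BLOCK_ACCT_ZONE_APPEND,
    BLOCK_ACCT_UNMAP,
    BLOCK_MAX_IOTYPE,
};

/* Hypothetical, simplified stand-in for the per-device statistics. */
typedef struct AcctStats {
    uint64_t nr_bytes[BLOCK_MAX_IOTYPE];
    uint64_t nr_ops[BLOCK_MAX_IOTYPE];
} AcctStats;

/* Hypothetical helper: record one completed request of the given type. */
static void acct_done(AcctStats *stats, enum BlockAcctType type,
                      uint64_t bytes)
{
    assert(type < BLOCK_MAX_IOTYPE);
    stats->nr_bytes[type] += bytes;
    stats->nr_ops[type]++;
}
```

Because every counter array is sized by `BLOCK_MAX_IOTYPE`, adding an enum entry before it grows each array by one slot, and the query code only needs to read the new index, as the block/qapi.c hunk of this patch does.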
5
6
6
[Name the iSCSI URL long option --iscsi-uri instead of --iscsi_uri for
7
Signed-off-by: Sam Li <faithilikerun@gmail.com>
7
consistency, fix --fd which was rejected due to an outdated
8
Message-id: 20230407082528.18841-4-faithilikerun@gmail.com
8
--socket-path check, and add missing entries[] terminator.
9
--Stefan]
10
11
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
9
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
12
---
10
---
13
contrib/vhost-user-scsi/vhost-user-scsi.c | 77 +++++++++++++++--------
11
qapi/block-core.json | 68 ++++++++++++++++++++++++++++++++------
14
1 file changed, 52 insertions(+), 25 deletions(-)
12
qapi/block.json | 4 +++
13
include/block/accounting.h | 1 +
14
block/qapi-sysemu.c | 11 ++++++
15
block/qapi.c | 18 ++++++++++
16
hw/block/virtio-blk.c | 4 +++
17
6 files changed, 95 insertions(+), 11 deletions(-)
15
18
16
diff --git a/contrib/vhost-user-scsi/vhost-user-scsi.c b/contrib/vhost-user-scsi/vhost-user-scsi.c
19
diff --git a/qapi/block-core.json b/qapi/block-core.json
17
index XXXXXXX..XXXXXXX 100644
20
index XXXXXXX..XXXXXXX 100644
18
--- a/contrib/vhost-user-scsi/vhost-user-scsi.c
21
--- a/qapi/block-core.json
19
+++ b/contrib/vhost-user-scsi/vhost-user-scsi.c
22
+++ b/qapi/block-core.json
20
@@ -XXX,XX +XXX,XX @@ fail:
23
@@ -XXX,XX +XXX,XX @@
21
24
# @min_wr_latency_ns: Minimum latency of write operations in the
22
/** vhost-user-scsi **/
25
# defined interval, in nanoseconds.
23
26
#
24
+static int opt_fdnum = -1;
27
+# @min_zone_append_latency_ns: Minimum latency of zone append operations
25
+static char *opt_socket_path;
28
+# in the defined interval, in nanoseconds
26
+static gboolean opt_print_caps;
29
+# (since 8.1)
27
+static char *iscsi_uri;
30
+#
28
+
31
# @min_flush_latency_ns: Minimum latency of flush operations in the
29
+static GOptionEntry entries[] = {
32
# defined interval, in nanoseconds.
30
+ { "print-capabilities", 'c', 0, G_OPTION_ARG_NONE, &opt_print_caps,
33
#
31
+ "Print capabilities", NULL },
34
@@ -XXX,XX +XXX,XX @@
32
+ { "fd", 'f', 0, G_OPTION_ARG_INT, &opt_fdnum,
35
# @max_wr_latency_ns: Maximum latency of write operations in the
33
+ "Use inherited fd socket", "FDNUM" },
36
# defined interval, in nanoseconds.
34
+ { "iscsi-uri", 'i', 0, G_OPTION_ARG_FILENAME, &iscsi_uri,
37
#
35
+ "iSCSI URI to connect to", "URI" },
38
+# @max_zone_append_latency_ns: Maximum latency of zone append operations
36
+ { "socket-path", 's', 0, G_OPTION_ARG_FILENAME, &opt_socket_path,
39
+# in the defined interval, in nanoseconds
37
+ "Use UNIX socket path", "PATH" },
40
+# (since 8.1)
38
+ { NULL, }
41
+#
39
+};
42
# @max_flush_latency_ns: Maximum latency of flush operations in the
40
+
43
# defined interval, in nanoseconds.
41
int main(int argc, char **argv)
44
#
45
@@ -XXX,XX +XXX,XX @@
46
# @avg_wr_latency_ns: Average latency of write operations in the
47
# defined interval, in nanoseconds.
48
#
49
+# @avg_zone_append_latency_ns: Average latency of zone append operations
50
+# in the defined interval, in nanoseconds
51
+# (since 8.1)
52
+#
53
# @avg_flush_latency_ns: Average latency of flush operations in the
54
# defined interval, in nanoseconds.
55
#
56
@@ -XXX,XX +XXX,XX @@
57
# @avg_wr_queue_depth: Average number of pending write operations
58
# in the defined interval.
59
#
60
+# @avg_zone_append_queue_depth: Average number of pending zone append
61
+# operations in the defined interval
62
+# (since 8.1).
63
+#
64
# Since: 2.5
65
##
66
{ 'struct': 'BlockDeviceTimedStats',
67
'data': { 'interval_length': 'int', 'min_rd_latency_ns': 'int',
68
'max_rd_latency_ns': 'int', 'avg_rd_latency_ns': 'int',
69
'min_wr_latency_ns': 'int', 'max_wr_latency_ns': 'int',
70
- 'avg_wr_latency_ns': 'int', 'min_flush_latency_ns': 'int',
71
- 'max_flush_latency_ns': 'int', 'avg_flush_latency_ns': 'int',
72
- 'avg_rd_queue_depth': 'number', 'avg_wr_queue_depth': 'number' } }
73
+ 'avg_wr_latency_ns': 'int', 'min_zone_append_latency_ns': 'int',
74
+ 'max_zone_append_latency_ns': 'int',
75
+ 'avg_zone_append_latency_ns': 'int',
76
+ 'min_flush_latency_ns': 'int', 'max_flush_latency_ns': 'int',
77
+ 'avg_flush_latency_ns': 'int', 'avg_rd_queue_depth': 'number',
78
+ 'avg_wr_queue_depth': 'number',
79
+ 'avg_zone_append_queue_depth': 'number' } }
80
81
##
82
# @BlockDeviceStats:
83
@@ -XXX,XX +XXX,XX @@
84
#
85
# @wr_bytes: The number of bytes written by the device.
86
#
87
+# @zone_append_bytes: The number of bytes appended by the zoned devices
88
+# (since 8.1)
89
+#
90
# @unmap_bytes: The number of bytes unmapped by the device (Since 4.2)
91
#
92
# @rd_operations: The number of read operations performed by the device.
93
#
94
# @wr_operations: The number of write operations performed by the device.
95
#
96
+# @zone_append_operations: The number of zone append operations performed
97
+# by the zoned devices (since 8.1)
98
+#
99
# @flush_operations: The number of cache flush operations performed by the
100
# device (since 0.15)
101
#
102
@@ -XXX,XX +XXX,XX @@
103
#
104
# @wr_total_time_ns: Total time spent on writes in nanoseconds (since 0.15).
105
#
106
+# @zone_append_total_time_ns: Total time spent on zone append writes
107
+# in nanoseconds (since 8.1)
108
+#
109
# @flush_total_time_ns: Total time spent on cache flushes in nanoseconds
110
# (since 0.15).
111
#
112
@@ -XXX,XX +XXX,XX @@
113
# @wr_merged: Number of write requests that have been merged into another
114
# request (Since 2.3).
115
#
116
+# @zone_append_merged: Number of zone append requests that have been merged
117
+# into another request (since 8.1)
118
+#
119
# @unmap_merged: Number of unmap requests that have been merged into another
120
# request (Since 4.2)
121
#
122
@@ -XXX,XX +XXX,XX @@
123
# @failed_wr_operations: The number of failed write operations
124
# performed by the device (Since 2.5)
125
#
126
+# @failed_zone_append_operations: The number of failed zone append write
127
+# operations performed by the zoned devices
128
+# (since 8.1)
129
+#
130
# @failed_flush_operations: The number of failed flush operations
131
# performed by the device (Since 2.5)
132
#
133
@@ -XXX,XX +XXX,XX @@
134
# @invalid_wr_operations: The number of invalid write operations
135
# performed by the device (Since 2.5)
136
#
137
+# @invalid_zone_append_operations: The number of invalid zone append operations
138
+# performed by the zoned device (since 8.1)
139
+#
140
# @invalid_flush_operations: The number of invalid flush operations
141
# performed by the device (Since 2.5)
142
#
143
@@ -XXX,XX +XXX,XX @@
144
#
145
# @wr_latency_histogram: @BlockLatencyHistogramInfo. (Since 4.0)
146
#
147
+# @zone_append_latency_histogram: @BlockLatencyHistogramInfo. (since 8.1)
148
+#
149
# @flush_latency_histogram: @BlockLatencyHistogramInfo. (Since 4.0)
150
#
151
# Since: 0.14
152
##
153
{ 'struct': 'BlockDeviceStats',
154
- 'data': {'rd_bytes': 'int', 'wr_bytes': 'int', 'unmap_bytes' : 'int',
155
- 'rd_operations': 'int', 'wr_operations': 'int',
156
+ 'data': {'rd_bytes': 'int', 'wr_bytes': 'int', 'zone_append_bytes': 'int',
157
+ 'unmap_bytes' : 'int', 'rd_operations': 'int',
158
+ 'wr_operations': 'int', 'zone_append_operations': 'int',
159
'flush_operations': 'int', 'unmap_operations': 'int',
160
'rd_total_time_ns': 'int', 'wr_total_time_ns': 'int',
161
- 'flush_total_time_ns': 'int', 'unmap_total_time_ns': 'int',
162
- 'wr_highest_offset': 'int',
163
- 'rd_merged': 'int', 'wr_merged': 'int', 'unmap_merged': 'int',
164
- '*idle_time_ns': 'int',
165
+ 'zone_append_total_time_ns': 'int', 'flush_total_time_ns': 'int',
166
+ 'unmap_total_time_ns': 'int', 'wr_highest_offset': 'int',
167
+ 'rd_merged': 'int', 'wr_merged': 'int', 'zone_append_merged': 'int',
168
+ 'unmap_merged': 'int', '*idle_time_ns': 'int',
169
'failed_rd_operations': 'int', 'failed_wr_operations': 'int',
170
- 'failed_flush_operations': 'int', 'failed_unmap_operations': 'int',
171
- 'invalid_rd_operations': 'int', 'invalid_wr_operations': 'int',
172
+ 'failed_zone_append_operations': 'int',
173
+ 'failed_flush_operations': 'int',
174
+ 'failed_unmap_operations': 'int', 'invalid_rd_operations': 'int',
175
+ 'invalid_wr_operations': 'int',
176
+ 'invalid_zone_append_operations': 'int',
177
'invalid_flush_operations': 'int', 'invalid_unmap_operations': 'int',
178
'account_invalid': 'bool', 'account_failed': 'bool',
179
'timed_stats': ['BlockDeviceTimedStats'],
180
'*rd_latency_histogram': 'BlockLatencyHistogramInfo',
181
'*wr_latency_histogram': 'BlockLatencyHistogramInfo',
182
+ '*zone_append_latency_histogram': 'BlockLatencyHistogramInfo',
183
'*flush_latency_histogram': 'BlockLatencyHistogramInfo' } }
184
185
##
186
diff --git a/qapi/block.json b/qapi/block.json
187
index XXXXXXX..XXXXXXX 100644
188
--- a/qapi/block.json
189
+++ b/qapi/block.json
190
@@ -XXX,XX +XXX,XX @@
191
# @boundaries-write: list of interval boundary values for write latency
192
# histogram.
193
#
194
+# @boundaries-zap: list of interval boundary values for zone append write
195
+# latency histogram.
196
+#
197
# @boundaries-flush: list of interval boundary values for flush latency
198
# histogram.
199
#
200
@@ -XXX,XX +XXX,XX @@
201
'*boundaries': ['uint64'],
202
'*boundaries-read': ['uint64'],
203
'*boundaries-write': ['uint64'],
204
+ '*boundaries-zap': ['uint64'],
205
'*boundaries-flush': ['uint64'] },
206
'allow-preconfig': true }
207
diff --git a/include/block/accounting.h b/include/block/accounting.h
208
index XXXXXXX..XXXXXXX 100644
209
--- a/include/block/accounting.h
210
+++ b/include/block/accounting.h
211
@@ -XXX,XX +XXX,XX @@ enum BlockAcctType {
212
BLOCK_ACCT_READ,
213
BLOCK_ACCT_WRITE,
214
BLOCK_ACCT_FLUSH,
215
+ BLOCK_ACCT_ZONE_APPEND,
216
BLOCK_ACCT_UNMAP,
217
BLOCK_MAX_IOTYPE,
218
};
219
diff --git a/block/qapi-sysemu.c b/block/qapi-sysemu.c
220
index XXXXXXX..XXXXXXX 100644
221
--- a/block/qapi-sysemu.c
222
+++ b/block/qapi-sysemu.c
223
@@ -XXX,XX +XXX,XX @@ void qmp_block_latency_histogram_set(
224
bool has_boundaries, uint64List *boundaries,
225
bool has_boundaries_read, uint64List *boundaries_read,
226
bool has_boundaries_write, uint64List *boundaries_write,
227
+ bool has_boundaries_append, uint64List *boundaries_append,
228
bool has_boundaries_flush, uint64List *boundaries_flush,
229
Error **errp)
42
{
230
{
43
VusDev *vdev_scsi = NULL;
231
@@ -XXX,XX +XXX,XX @@ void qmp_block_latency_histogram_set(
44
- char *unix_fn = NULL;
232
}
45
- char *iscsi_uri = NULL;
233
}
46
- int lsock = -1, csock = -1, opt, err = EXIT_SUCCESS;
234
47
+ int lsock = -1, csock = -1, err = EXIT_SUCCESS;
235
+ if (has_boundaries || has_boundaries_append) {
48
236
+ ret = block_latency_histogram_set(
49
- while ((opt = getopt(argc, argv, "u:i:")) != -1) {
237
+ stats, BLOCK_ACCT_ZONE_APPEND,
50
- switch (opt) {
238
+ has_boundaries_append ? boundaries_append : boundaries);
51
- case 'h':
239
+ if (ret) {
52
- goto help;
240
+ error_setg(errp, "Device '%s' set append write boundaries fail", id);
53
- case 'u':
241
+ return;
54
- unix_fn = g_strdup(optarg);
242
+ }
55
- break;
56
- case 'i':
57
- iscsi_uri = g_strdup(optarg);
58
- break;
59
- default:
60
- goto help;
61
- }
62
+ GError *error = NULL;
63
+ GOptionContext *context;
64
+
65
+ context = g_option_context_new(NULL);
66
+ g_option_context_add_main_entries(context, entries, NULL);
67
+ if (!g_option_context_parse(context, &argc, &argv, &error)) {
68
+ g_printerr("Option parsing failed: %s\n", error->message);
69
+ exit(EXIT_FAILURE);
70
+ }
243
+ }
71
+
244
+
72
+ if (opt_print_caps) {
245
if (has_boundaries || has_boundaries_flush) {
73
+ g_print("{\n");
246
ret = block_latency_histogram_set(
74
+ g_print(" \"type\": \"scsi\"\n");
247
stats, BLOCK_ACCT_FLUSH,
75
+ g_print("}\n");
248
diff --git a/block/qapi.c b/block/qapi.c
76
+ goto out;
249
index XXXXXXX..XXXXXXX 100644
250
--- a/block/qapi.c
251
+++ b/block/qapi.c
252
@@ -XXX,XX +XXX,XX @@ static void bdrv_query_blk_stats(BlockDeviceStats *ds, BlockBackend *blk)
253
254
ds->rd_bytes = stats->nr_bytes[BLOCK_ACCT_READ];
255
ds->wr_bytes = stats->nr_bytes[BLOCK_ACCT_WRITE];
256
+ ds->zone_append_bytes = stats->nr_bytes[BLOCK_ACCT_ZONE_APPEND];
257
ds->unmap_bytes = stats->nr_bytes[BLOCK_ACCT_UNMAP];
258
ds->rd_operations = stats->nr_ops[BLOCK_ACCT_READ];
259
ds->wr_operations = stats->nr_ops[BLOCK_ACCT_WRITE];
260
+ ds->zone_append_operations = stats->nr_ops[BLOCK_ACCT_ZONE_APPEND];
261
ds->unmap_operations = stats->nr_ops[BLOCK_ACCT_UNMAP];
262
263
ds->failed_rd_operations = stats->failed_ops[BLOCK_ACCT_READ];
264
ds->failed_wr_operations = stats->failed_ops[BLOCK_ACCT_WRITE];
265
+ ds->failed_zone_append_operations =
266
+ stats->failed_ops[BLOCK_ACCT_ZONE_APPEND];
267
ds->failed_flush_operations = stats->failed_ops[BLOCK_ACCT_FLUSH];
268
ds->failed_unmap_operations = stats->failed_ops[BLOCK_ACCT_UNMAP];
269
270
ds->invalid_rd_operations = stats->invalid_ops[BLOCK_ACCT_READ];
271
ds->invalid_wr_operations = stats->invalid_ops[BLOCK_ACCT_WRITE];
272
+ ds->invalid_zone_append_operations =
273
+ stats->invalid_ops[BLOCK_ACCT_ZONE_APPEND];
274
ds->invalid_flush_operations =
275
stats->invalid_ops[BLOCK_ACCT_FLUSH];
276
ds->invalid_unmap_operations = stats->invalid_ops[BLOCK_ACCT_UNMAP];
277
278
ds->rd_merged = stats->merged[BLOCK_ACCT_READ];
279
ds->wr_merged = stats->merged[BLOCK_ACCT_WRITE];
280
+ ds->zone_append_merged = stats->merged[BLOCK_ACCT_ZONE_APPEND];
281
ds->unmap_merged = stats->merged[BLOCK_ACCT_UNMAP];
282
ds->flush_operations = stats->nr_ops[BLOCK_ACCT_FLUSH];
283
ds->wr_total_time_ns = stats->total_time_ns[BLOCK_ACCT_WRITE];
284
+ ds->zone_append_total_time_ns =
285
+ stats->total_time_ns[BLOCK_ACCT_ZONE_APPEND];
286
ds->rd_total_time_ns = stats->total_time_ns[BLOCK_ACCT_READ];
287
ds->flush_total_time_ns = stats->total_time_ns[BLOCK_ACCT_FLUSH];
288
ds->unmap_total_time_ns = stats->total_time_ns[BLOCK_ACCT_UNMAP];
289
@@ -XXX,XX +XXX,XX @@ static void bdrv_query_blk_stats(BlockDeviceStats *ds, BlockBackend *blk)
290
291
TimedAverage *rd = &ts->latency[BLOCK_ACCT_READ];
292
TimedAverage *wr = &ts->latency[BLOCK_ACCT_WRITE];
293
+ TimedAverage *zap = &ts->latency[BLOCK_ACCT_ZONE_APPEND];
294
TimedAverage *fl = &ts->latency[BLOCK_ACCT_FLUSH];
295
296
dev_stats->interval_length = ts->interval_length;
297
@@ -XXX,XX +XXX,XX @@ static void bdrv_query_blk_stats(BlockDeviceStats *ds, BlockBackend *blk)
298
dev_stats->max_wr_latency_ns = timed_average_max(wr);
299
dev_stats->avg_wr_latency_ns = timed_average_avg(wr);
300
301
+ dev_stats->min_zone_append_latency_ns = timed_average_min(zap);
302
+ dev_stats->max_zone_append_latency_ns = timed_average_max(zap);
303
+ dev_stats->avg_zone_append_latency_ns = timed_average_avg(zap);
304
+
305
dev_stats->min_flush_latency_ns = timed_average_min(fl);
306
dev_stats->max_flush_latency_ns = timed_average_max(fl);
307
dev_stats->avg_flush_latency_ns = timed_average_avg(fl);
308
@@ -XXX,XX +XXX,XX @@ static void bdrv_query_blk_stats(BlockDeviceStats *ds, BlockBackend *blk)
309
block_acct_queue_depth(ts, BLOCK_ACCT_READ);
310
dev_stats->avg_wr_queue_depth =
311
block_acct_queue_depth(ts, BLOCK_ACCT_WRITE);
312
+ dev_stats->avg_zone_append_queue_depth =
313
+ block_acct_queue_depth(ts, BLOCK_ACCT_ZONE_APPEND);
314
315
QAPI_LIST_PREPEND(ds->timed_stats, dev_stats);
77
}
316
}
78
- if (!unix_fn || !iscsi_uri) {
317
@@ -XXX,XX +XXX,XX @@ static void bdrv_query_blk_stats(BlockDeviceStats *ds, BlockBackend *blk)
318
= bdrv_latency_histogram_stats(&hgram[BLOCK_ACCT_READ]);
319
ds->wr_latency_histogram
320
= bdrv_latency_histogram_stats(&hgram[BLOCK_ACCT_WRITE]);
321
+ ds->zone_append_latency_histogram
322
+ = bdrv_latency_histogram_stats(&hgram[BLOCK_ACCT_ZONE_APPEND]);
323
ds->flush_latency_histogram
324
= bdrv_latency_histogram_stats(&hgram[BLOCK_ACCT_FLUSH]);
325
}
326
diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
327
index XXXXXXX..XXXXXXX 100644
328
--- a/hw/block/virtio-blk.c
329
+++ b/hw/block/virtio-blk.c
330
@@ -XXX,XX +XXX,XX @@ static int virtio_blk_handle_zone_append(VirtIOBlockReq *req,
331
data->in_num = in_num;
332
data->zone_append_data.offset = offset;
333
qemu_iovec_init_external(&req->qiov, out_iov, out_num);
79
+
334
+
80
+ if (!iscsi_uri) {
335
+ block_acct_start(blk_get_stats(s->blk), &req->acct, len,
81
goto help;
336
+ BLOCK_ACCT_ZONE_APPEND);
82
}
337
+
83
338
blk_aio_zone_append(s->blk, &data->zone_append_data.offset, &req->qiov, 0,
84
- lsock = unix_sock_new(unix_fn);
339
virtio_blk_zone_append_complete, data);
85
- if (lsock < 0) {
340
return 0;
86
- goto err;
87
+ if (opt_socket_path) {
88
+ lsock = unix_sock_new(opt_socket_path);
89
+ if (lsock < 0) {
90
+ exit(EXIT_FAILURE);
91
+ }
92
+ } else if (opt_fdnum < 0) {
93
+ g_print("%s\n", g_option_context_get_help(context, true, NULL));
94
+ exit(EXIT_FAILURE);
95
+ } else {
96
+ lsock = opt_fdnum;
97
}
98
99
csock = accept(lsock, NULL, NULL);
100
@@ -XXX,XX +XXX,XX @@ out:
101
if (vdev_scsi) {
102
g_main_loop_unref(vdev_scsi->loop);
103
g_free(vdev_scsi);
104
- unlink(unix_fn);
105
+ unlink(opt_socket_path);
106
}
107
if (csock >= 0) {
108
close(csock);
109
@@ -XXX,XX +XXX,XX @@ out:
110
if (lsock >= 0) {
111
close(lsock);
112
}
113
- g_free(unix_fn);
114
+ g_free(opt_socket_path);
115
g_free(iscsi_uri);
116
117
return err;
118
@@ -XXX,XX +XXX,XX @@ err:
119
goto out;
120
121
help:
122
- fprintf(stderr, "Usage: %s [ -u unix_sock_path -i iscsi_uri ] | [ -h ]\n",
123
+ fprintf(stderr, "Usage: %s [ -s socket-path -i iscsi-uri -f fd -c ] | [ -h ]\n",
124
argv[0]);
125
- fprintf(stderr, " -u path to unix socket\n");
126
- fprintf(stderr, " -i iscsi uri for lun 0\n");
127
+ fprintf(stderr, " -s, --socket-path=SOCKET_PATH path to unix socket\n");
128
+ fprintf(stderr, " -i, --iscsi-uri=ISCSI_URI iscsi uri for lun 0\n");
129
+ fprintf(stderr, " -f, --fd=FILE_DESCRIPTOR file-descriptor\n");
130
+ fprintf(stderr, " -c, --print-capabilities print capabilities and quit\n");
131
fprintf(stderr, " -h print help and quit\n");
132
133
goto err;
134
--
341
--
135
2.35.1
342
2.39.2
New patch
1
From: Sam Li <faithilikerun@gmail.com>
1
2
3
Signed-off-by: Sam Li <faithilikerun@gmail.com>
4
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
5
Message-id: 20230407082528.18841-5-faithilikerun@gmail.com
6
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
7
---
8
hw/block/virtio-blk.c | 12 ++++++++++++
9
hw/block/trace-events | 7 +++++++
10
2 files changed, 19 insertions(+)
11
12
diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
13
index XXXXXXX..XXXXXXX 100644
14
--- a/hw/block/virtio-blk.c
15
+++ b/hw/block/virtio-blk.c
16
@@ -XXX,XX +XXX,XX @@ static void virtio_blk_zone_report_complete(void *opaque, int ret)
17
int64_t nz = data->zone_report_data.nr_zones;
18
int8_t err_status = VIRTIO_BLK_S_OK;
19
20
+ trace_virtio_blk_zone_report_complete(vdev, req, nz, ret);
21
if (ret) {
22
err_status = VIRTIO_BLK_S_ZONE_INVALID_CMD;
23
goto out;
24
@@ -XXX,XX +XXX,XX @@ static void virtio_blk_handle_zone_report(VirtIOBlockReq *req,
25
nr_zones = (req->in_len - sizeof(struct virtio_blk_inhdr) -
26
sizeof(struct virtio_blk_zone_report)) /
27
sizeof(struct virtio_blk_zone_descriptor);
28
+ trace_virtio_blk_handle_zone_report(vdev, req,
29
+ offset >> BDRV_SECTOR_BITS, nr_zones);
30
31
zone_size = sizeof(BlockZoneDescriptor) * nr_zones;
32
data = g_malloc(sizeof(ZoneCmdData));
33
@@ -XXX,XX +XXX,XX @@ static void virtio_blk_zone_mgmt_complete(void *opaque, int ret)
34
{
35
VirtIOBlockReq *req = opaque;
36
VirtIOBlock *s = req->dev;
37
+ VirtIODevice *vdev = VIRTIO_DEVICE(s);
38
int8_t err_status = VIRTIO_BLK_S_OK;
39
+ trace_virtio_blk_zone_mgmt_complete(vdev, req, ret);
40
41
if (ret) {
42
err_status = VIRTIO_BLK_S_ZONE_INVALID_CMD;
43
@@ -XXX,XX +XXX,XX @@ static int virtio_blk_handle_zone_mgmt(VirtIOBlockReq *req, BlockZoneOp op)
44
/* Entire drive capacity */
45
offset = 0;
46
len = capacity;
47
+ trace_virtio_blk_handle_zone_reset_all(vdev, req, 0,
48
+ bs->total_sectors);
49
} else {
50
if (bs->bl.zone_size > capacity - offset) {
51
/* The zoned device allows the last smaller zone. */
52
@@ -XXX,XX +XXX,XX @@ static int virtio_blk_handle_zone_mgmt(VirtIOBlockReq *req, BlockZoneOp op)
53
} else {
54
len = bs->bl.zone_size;
55
}
56
+ trace_virtio_blk_handle_zone_mgmt(vdev, req, op,
57
+ offset >> BDRV_SECTOR_BITS,
58
+ len >> BDRV_SECTOR_BITS);
59
}
60
61
if (!check_zoned_request(s, offset, len, false, &err_status)) {
62
@@ -XXX,XX +XXX,XX @@ static void virtio_blk_zone_append_complete(void *opaque, int ret)
63
err_status = VIRTIO_BLK_S_ZONE_INVALID_CMD;
64
goto out;
65
}
66
+ trace_virtio_blk_zone_append_complete(vdev, req, append_sector, ret);
67
68
out:
69
aio_context_acquire(blk_get_aio_context(s->conf.conf.blk));
70
@@ -XXX,XX +XXX,XX @@ static int virtio_blk_handle_zone_append(VirtIOBlockReq *req,
71
int64_t offset = virtio_ldq_p(vdev, &req->out.sector) << BDRV_SECTOR_BITS;
72
int64_t len = iov_size(out_iov, out_num);
73
74
+ trace_virtio_blk_handle_zone_append(vdev, req, offset >> BDRV_SECTOR_BITS);
75
if (!check_zoned_request(s, offset, len, true, &err_status)) {
76
goto out;
77
}
78
diff --git a/hw/block/trace-events b/hw/block/trace-events
79
index XXXXXXX..XXXXXXX 100644
80
--- a/hw/block/trace-events
81
+++ b/hw/block/trace-events
82
@@ -XXX,XX +XXX,XX @@ pflash_write_unknown(const char *name, uint8_t cmd) "%s: unknown command 0x%02x"
83
# virtio-blk.c
84
virtio_blk_req_complete(void *vdev, void *req, int status) "vdev %p req %p status %d"
85
virtio_blk_rw_complete(void *vdev, void *req, int ret) "vdev %p req %p ret %d"
86
+virtio_blk_zone_report_complete(void *vdev, void *req, unsigned int nr_zones, int ret) "vdev %p req %p nr_zones %u ret %d"
87
+virtio_blk_zone_mgmt_complete(void *vdev, void *req, int ret) "vdev %p req %p ret %d"
88
+virtio_blk_zone_append_complete(void *vdev, void *req, int64_t sector, int ret) "vdev %p req %p, append sector 0x%" PRIx64 " ret %d"
89
virtio_blk_handle_write(void *vdev, void *req, uint64_t sector, size_t nsectors) "vdev %p req %p sector %"PRIu64" nsectors %zu"
90
virtio_blk_handle_read(void *vdev, void *req, uint64_t sector, size_t nsectors) "vdev %p req %p sector %"PRIu64" nsectors %zu"
91
virtio_blk_submit_multireq(void *vdev, void *mrb, int start, int num_reqs, uint64_t offset, size_t size, bool is_write) "vdev %p mrb %p start %d num_reqs %d offset %"PRIu64" size %zu is_write %d"
92
+virtio_blk_handle_zone_report(void *vdev, void *req, int64_t sector, unsigned int nr_zones) "vdev %p req %p sector 0x%" PRIx64 " nr_zones %u"
93
+virtio_blk_handle_zone_mgmt(void *vdev, void *req, uint8_t op, int64_t sector, int64_t len) "vdev %p req %p op 0x%x sector 0x%" PRIx64 " len 0x%" PRIx64 ""
94
+virtio_blk_handle_zone_reset_all(void *vdev, void *req, int64_t sector, int64_t len) "vdev %p req %p sector 0x%" PRIx64 " cap 0x%" PRIx64 ""
95
+virtio_blk_handle_zone_append(void *vdev, void *req, int64_t sector) "vdev %p req %p, append sector 0x%" PRIx64 ""
96
97
# hd-geometry.c
98
hd_geometry_lchs_guess(void *blk, int cyls, int heads, int secs) "blk %p LCHS %d %d %d"
99
--
100
2.39.2
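
Each trace-events declaration in the patch above pairs a C-style argument list with a printf-style format string. As a rough illustration only (this is not QEMU's tracetool, and the helper names parse_trace_event, count_conversions and check_trace_event are made up for this sketch), the following Python checks that the number of declared arguments matches the number of conversions in the format. It assumes each `%" PRIx64` fragment counts as one conversion and that `%%` is a literal percent sign.

```python
import re

# One declaration looks like: name(arg list) "format" [MACRO "more format"]
DECL = re.compile(r'^(?P<name>\w+)\((?P<args>[^)]*)\)\s*(?P<fmt>.+)$')


def parse_trace_event(line):
    """Split a declaration into (name, list of arguments, raw format text)."""
    m = DECL.match(line.strip())
    if m is None:
        raise ValueError(f"not a trace event declaration: {line!r}")
    args = [a.strip() for a in m.group('args').split(',') if a.strip()]
    if args == ['void']:
        # "name(void)" declares an event without arguments.
        args = []
    return m.group('name'), args, m.group('fmt')


def count_conversions(fmt):
    """Count '%' conversions after dropping literal '%%' pairs."""
    return fmt.replace('%%', '').count('%')


def check_trace_event(line):
    """Return (name, True) when argument and conversion counts agree."""
    name, args, fmt = parse_trace_event(line)
    return name, len(args) == count_conversions(fmt)


# One of the declarations added by the patch above.
example = ('virtio_blk_zone_mgmt_complete(void *vdev, void *req, int ret) '
           '"vdev %p req %p ret %d"')
print(check_trace_event(example))
```

A check like this only compares counts; it does not verify that each conversion matches the type of its argument.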
From: Sam Li <faithilikerun@gmail.com>

Add documentation with an example of using the virtio-blk driver
to pass zoned block devices through to the guest.

Signed-off-by: Sam Li <faithilikerun@gmail.com>
Message-id: 20230407082528.18841-6-faithilikerun@gmail.com
[Fix Sphinx indentation error by turning command-lines into
pre-formatted text.
--Stefan]
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 docs/devel/zoned-storage.rst | 25 ++++++++++++++++++++++---
 1 file changed, 22 insertions(+), 3 deletions(-)

diff --git a/docs/devel/zoned-storage.rst b/docs/devel/zoned-storage.rst
index XXXXXXX..XXXXXXX 100644
--- a/docs/devel/zoned-storage.rst
+++ b/docs/devel/zoned-storage.rst
@@ -XXX,XX +XXX,XX @@ When the BlockBackend's BlockLimits model reports a zoned storage device, users
 like the virtio-blk emulation or the qemu-io-cmds.c utility can use block layer
 APIs for zoned storage emulation or testing.
 
-For example, to test zone_report on a null_blk device using qemu-io is:
-$ path/to/qemu-io --image-opts -n driver=host_device,filename=/dev/nullb0
--c "zrp offset nr_zones"
+For example, to test zone_report on a null_blk device using qemu-io::
+
+    $ path/to/qemu-io --image-opts -n driver=host_device,filename=/dev/nullb0 -c "zrp offset nr_zones"
+
+To expose the host's zoned block device through virtio-blk, the command line
+can use the -blockdev and -device parameters::
+
+    -blockdev node-name=drive0,driver=host_device,filename=/dev/nullb0,cache.direct=on \
+    -device virtio-blk-pci,drive=drive0
+
+Or use only the -drive parameter::
+
+    -drive driver=host_device,file=/dev/nullb0,if=virtio,cache.direct=on
+
+Additionally, QEMU has several ways of supporting zoned storage, including:
+(1) Using virtio-scsi: --device scsi-block allows for the passing through of
+SCSI ZBC devices, enabling the attachment of ZBC or ZAC HDDs to QEMU.
+(2) PCI device pass-through: While NVMe ZNS emulation is available for testing
+purposes, it cannot yet pass through a zoned device from the host. To pass an
+NVMe ZNS device to the guest, use VFIO PCI to pass the entire NVMe PCI adapter
+through to the guest. Likewise, an HDD HBA can be passed through to the guest
+together with all HDDs attached to the HBA.
--
2.39.2
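
The -blockdev/-device pair documented in the patch above can be assembled programmatically, for example in a test harness. The following is a minimal sketch, assuming a hypothetical helper named virtio_blk_zoned_args; it only builds the argument list from the options quoted in the documentation and does not launch QEMU. The rest of the command line (binary name, machine options, guest image) is left to the caller.

```python
def virtio_blk_zoned_args(host_dev, node_name="drive0"):
    """Build the -blockdev/-device arguments for a host zoned block device.

    host_dev is the host block device path (the documentation uses
    /dev/nullb0); node_name links the block node to the virtio-blk device.
    """
    blockdev = ",".join([
        f"node-name={node_name}",
        "driver=host_device",
        f"filename={host_dev}",
        "cache.direct=on",
    ])
    device = f"virtio-blk-pci,drive={node_name}"
    return ["-blockdev", blockdev, "-device", device]


# Arguments for the null_blk device used in the documentation example.
print(" ".join(virtio_blk_zoned_args("/dev/nullb0")))
```

Returning a list rather than one string keeps each option as a separate argument, so it can be appended to an argument vector without shell quoting.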
From: Carlos Santos <casantos@redhat.com>

The trace events file is not useful when QEMU is configured with
--enable-trace-backends=nop.

Signed-off-by: Carlos Santos <casantos@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230408010410.281263-1-casantos@redhat.com>
---
 trace/meson.build | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/trace/meson.build b/trace/meson.build
index XXXXXXX..XXXXXXX 100644
--- a/trace/meson.build
+++ b/trace/meson.build
@@ -XXX,XX +XXX,XX @@ trace_events_all = custom_target('trace-events-all',
                                  input: trace_events_files,
                                  command: [ 'cat', '@INPUT@' ],
                                  capture: true,
-                                 install: true,
+                                 install: get_option('trace_backends') != [ 'nop' ],
                                  install_dir: qemu_datadir)
 
 if 'ust' in get_option('trace_backends')
--
2.39.2
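
The new install condition compares the whole trace_backends array against the single-element array [ 'nop' ]. As a small sketch of that predicate, assuming Meson's array inequality behaves like Python list inequality (the function name should_install_trace_events is invented for illustration):

```python
def should_install_trace_events(trace_backends):
    """Mirror of: get_option('trace_backends') != [ 'nop' ].

    The file is skipped only when 'nop' is the sole configured backend.
    """
    return list(trace_backends) != ["nop"]


# A few backend selections and whether trace-events-all would be installed.
for backends in (["nop"], ["log"], ["log", "simple"]):
    print(backends, should_install_trace_events(backends))
```

Under this reading, any configuration that lists a backend other than a lone 'nop' still installs the file.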