Currently, attaching zoned block devices (i.e., storage devices
compliant with the ZAC/ZBC standards) using several virtio methods
doesn't work properly, because zoned devices appear as regular block
devices in the guest. This may cause unexpected I/O errors and,
potentially, data corruption.

To be more precise, attaching a zoned device via virtio-pci-blk,
virtio-scsi-pci/scsi-disk or virtio-scsi-pci/scsi-hd demonstrates the
above behavior. The virtio-scsi-pci/scsi-block method works with a
recent patch. The virtio-scsi-pci/scsi-generic method also appears to
handle zoned devices without problems.

This patch set adds code to check whether the backing device that is
being opened is a Host Managed zoned device. If this is the case, the
patch set prohibits attaching such a device for all use cases that lack
proper zoned support.

Host Aware zoned block devices are designed to work as regular block
devices in a guest system that does not support ZBD. Therefore, this
patch set doesn't prohibit attachment of Host Aware devices.

Considering that there are still a couple of different working ways to
attach a ZBD, this patch set provides a reasonable short-term solution
for this problem. What about the long term?

It appears to be beneficial to add proper ZBD support to virtio-blk.
In order to support this use case properly, some virtio-blk protocol
changes will be necessary. They are needed to allow the host code to
propagate the ZBD properties that the virtio guest driver requires to
configure the guest block device as a ZBD, such as the zoned device
model, the zone size and the total number of zones. Further, support
needs to be added for the REPORT ZONES command as well as for zone
operations, such as OPEN ZONE, CLOSE ZONE, FINISH ZONE and RESET ZONE.

These additions to the protocol are relatively straightforward, but
they need to be approved by the virtio TC and the whole process may
take some time.

ZBD support for virtio-scsi-pci/scsi-disk and virtio-scsi-pci/scsi-hd
does not seem as necessary. Users will be expected to attach zoned
block devices via virtio-scsi-pci/scsi-block instead.

This patch set contains some Linux-specific code. This code is
necessary to obtain the Zoned Block Device model value from Linux
sysfs.

History:

v1 -> v2:
- rework the code to be permission-based
- always allow Host Aware devices to be attached
- add fix for Host Aware attachments aka RCAP output snoop

v2 -> v3:
- drop the patch for RCAP output snoop - merged separately

Dmitry Fomichev (4):
  block: Add zoned device model property
  raw: Recognize zoned backing devices
  block/ide/scsi: Set BLK_PERM_SUPPORT_ZONED
  raw: Don't open ZBDs if backend can't handle them

 block.c                   | 19 +++++++++
 block/file-posix.c        | 88 +++++++++++++++++++++++++++++++++------
 block/raw-format.c        |  8 ++++
 hw/block/block.c          |  8 +++-
 hw/block/fdc.c            |  4 +-
 hw/block/nvme.c           |  2 +-
 hw/block/virtio-blk.c     |  2 +-
 hw/block/xen-block.c      |  2 +-
 hw/ide/qdev.c             |  2 +-
 hw/scsi/scsi-disk.c       | 13 +++---
 hw/scsi/scsi-generic.c    |  2 +-
 hw/usb/dev-storage.c      |  2 +-
 include/block/block.h     | 21 +++++++++-
 include/block/block_int.h |  4 ++
 include/hw/block/block.h  |  3 +-
 15 files changed, 150 insertions(+), 30 deletions(-)

--
2.21.0

This commit adds Zoned Device Model (as defined in T10 ZBC and
T13 ZAC standards) as a block driver property, along with some
useful access functions.

A new backend driver permission, BLK_PERM_SUPPORT_ZONED, is also
introduced. Only the drivers having this permission will be allowed
to open zoned block devices.

No code is added yet to initialize or check the value of this new
property, therefore this commit doesn't change any functionality.

Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
---
 block.c                   | 19 +++++++++++++++++++
 include/block/block.h     | 21 ++++++++++++++++++++-
 include/block/block_int.h |  4 ++++
 3 files changed, 43 insertions(+), 1 deletion(-)

diff --git a/block.c b/block.c
index XXXXXXX..XXXXXXX 100644
--- a/block.c
+++ b/block.c
@@ -XXX,XX +XXX,XX @@ void bdrv_get_geometry(BlockDriverState *bs, uint64_t *nb_sectors_ptr)
     *nb_sectors_ptr = nb_sectors < 0 ? 0 : nb_sectors;
 }
 
+uint8_t bdrv_get_zoned_model(BlockDriverState *bs)
+{
+    if (bs->drv->bdrv_get_zoned_info) {
+        bs->drv->bdrv_get_zoned_info(bs);
+    }
+
+    return bs->bl.zoned_model;
+}
+
+uint8_t bdrv_is_zoned(BlockDriverState *bs)
+{
+    /*
+     * Host Aware zone devices are supposed to be able to work
+     * just like regular block devices. Thus, we only consider
+     * Host Managed devices to be zoned here.
+     */
+    return bdrv_get_zoned_model(bs) == BLK_ZONED_MODEL_HM;
+}
+
 bool bdrv_is_sg(BlockDriverState *bs)
 {
     return bs->sg;
diff --git a/include/block/block.h b/include/block/block.h
index XXXXXXX..XXXXXXX 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -XXX,XX +XXX,XX @@ enum {
      */
     BLK_PERM_GRAPH_MOD          = 0x10,
 
+    /** This permission is required to open zoned block devices. */
+    BLK_PERM_SUPPORT_ZONED      = 0x20,
+
     BLK_PERM_ALL                = 0x1f,
 
     DEFAULT_PERM_PASSTHROUGH    = BLK_PERM_CONSISTENT_READ
                                  | BLK_PERM_WRITE
                                  | BLK_PERM_WRITE_UNCHANGED
-                                 | BLK_PERM_RESIZE,
+                                 | BLK_PERM_RESIZE
+                                 | BLK_PERM_SUPPORT_ZONED,
     DEFAULT_PERM_UNCHANGED      = BLK_PERM_ALL & ~DEFAULT_PERM_PASSTHROUGH,
 };
 
 char *bdrv_perm_names(uint64_t perm);
 
+/*
+ * Known zoned device models.
+ *
+ * TODO For a Linux host, it could be preferable to include
+ * /usr/include/linux/blkzoned.h instead of defining ZBD-specific
+ * values here.
+ */
+enum blk_zoned_model {
+    BLK_ZONED_MODEL_NONE, /* Regular block device */
+    BLK_ZONED_MODEL_HA,   /* Host-aware zoned block device */
+    BLK_ZONED_MODEL_HM,   /* Host-managed zoned block device */
+};
+
 /* disk I/O throttling */
 void bdrv_init(void);
 void bdrv_init_with_whitelist(void);
@@ -XXX,XX +XXX,XX @@ int64_t bdrv_get_allocated_file_size(BlockDriverState *bs);
 BlockMeasureInfo *bdrv_measure(BlockDriver *drv, QemuOpts *opts,
                                BlockDriverState *in_bs, Error **errp);
 void bdrv_get_geometry(BlockDriverState *bs, uint64_t *nb_sectors_ptr);
+uint8_t bdrv_get_zoned_model(BlockDriverState *bs);
+uint8_t bdrv_is_zoned(BlockDriverState *bs);
 void bdrv_refresh_limits(BlockDriverState *bs, Error **errp);
 int bdrv_commit(BlockDriverState *bs);
 int bdrv_change_backing_file(BlockDriverState *bs,
diff --git a/include/block/block_int.h b/include/block/block_int.h
index XXXXXXX..XXXXXXX 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -XXX,XX +XXX,XX @@ struct BlockDriver {
     bool (*bdrv_debug_is_suspended)(BlockDriverState *bs, const char *tag);
 
     void (*bdrv_refresh_limits)(BlockDriverState *bs, Error **errp);
+    void (*bdrv_get_zoned_info)(BlockDriverState *bs);
 
     /*
      * Returns 1 if newly created images are guaranteed to contain only
@@ -XXX,XX +XXX,XX @@ typedef struct BlockLimits {
 
     /* maximum number of iovec elements */
     int max_iov;
+
+    /* Zoned device model. Zero value indicates a regular block device */
+    uint8_t zoned_model;
 } BlockLimits;
 
 typedef struct BdrvOpBlocker BdrvOpBlocker;
--
2.21.0

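The accessor pair introduced by this patch can be illustrated outside of
QEMU with a minimal sketch. Everything prefixed with sketch_/SKETCH_ below
is a hypothetical stand-in (it is not QEMU code); only the three model
values and the "Host Managed only" rule are taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Same ordering as enum blk_zoned_model in the patch. */
enum sketch_zoned_model {
    SKETCH_ZONED_MODEL_NONE, /* regular block device */
    SKETCH_ZONED_MODEL_HA,   /* host-aware */
    SKETCH_ZONED_MODEL_HM,   /* host-managed */
};

/* Hypothetical stand-in for BlockDriverState plus its driver hook. */
struct sketch_bs {
    uint8_t zoned_model;
    /* Optional hook that refreshes zoned_model, may be NULL. */
    void (*get_zoned_info)(struct sketch_bs *bs);
};

/* Refresh the model through the hook (if any) and return it. */
static uint8_t sketch_get_zoned_model(struct sketch_bs *bs)
{
    assert(bs != NULL);
    if (bs->get_zoned_info) {
        bs->get_zoned_info(bs);
    }
    return bs->zoned_model;
}

/* Only host-managed devices count as zoned, as in the patch. */
static bool sketch_is_zoned(struct sketch_bs *bs)
{
    return sketch_get_zoned_model(bs) == SKETCH_ZONED_MODEL_HM;
}
```

With this rule, a host-aware device reports its model through
sketch_get_zoned_model() but is still treated as a regular device by
sketch_is_zoned().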
The purpose of this patch is to recognize a zoned block device (ZBD)
when it is opened as a raw file. The new code initializes the zoned
model property introduced by the previous commit.

This commit is Linux-specific as it gets the Zoned Block Device Model
value (none/host-managed/host-aware) from sysfs on the host.

In order to avoid code duplication in file-posix.c, a common helper
function is added to read values of sysfs entries under
/sys/block/<dev>/queue. This way, the existing function that reads
the value of the "max_segments" entry and the new function that reads
the "zoned" value both share the same helper code.

Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
---
 block/file-posix.c | 74 ++++++++++++++++++++++++++++++++++++++--------
 block/raw-format.c |  8 +++++
 2 files changed, 70 insertions(+), 12 deletions(-)

diff --git a/block/file-posix.c b/block/file-posix.c
index XXXXXXX..XXXXXXX 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -XXX,XX +XXX,XX @@ static int sg_get_max_transfer_length(int fd)
 #endif
 }
 
-static int sg_get_max_segments(int fd)
+static int hdev_read_blk_queue_entry(int fd, const char *key,
+                                     char *buf, int buf_len)
 {
 #ifdef CONFIG_LINUX
-    char buf[32];
-    const char *end;
     char *sysfspath = NULL;
     int ret;
     int sysfd = -1;
-    long max_segments;
     struct stat st;
 
     if (fstat(fd, &st)) {
@@ -XXX,XX +XXX,XX @@ static int sg_get_max_segments(int fd)
         goto out;
     }
 
-    sysfspath = g_strdup_printf("/sys/dev/block/%u:%u/queue/max_segments",
-                                major(st.st_rdev), minor(st.st_rdev));
+    sysfspath = g_strdup_printf("/sys/dev/block/%u:%u/queue/%s",
+                                major(st.st_rdev), minor(st.st_rdev), key);
     sysfd = open(sysfspath, O_RDONLY);
     if (sysfd == -1) {
         ret = -errno;
         goto out;
     }
     do {
-        ret = read(sysfd, buf, sizeof(buf) - 1);
+        ret = read(sysfd, buf, buf_len - 1);
     } while (ret == -1 && errno == EINTR);
     if (ret < 0) {
         ret = -errno;
-        goto out;
     } else if (ret == 0) {
         ret = -EIO;
+    }
+out:
+    if (sysfd != -1) {
+        close(sysfd);
+    }
+    g_free(sysfspath);
+    return ret;
+#else
+    return -ENOTSUP;
+#endif
+}
+
+static int sg_get_max_segments(int fd)
+{
+#ifdef CONFIG_LINUX
+    char buf[32];
+    const char *end;
+    int ret;
+    long max_segments;
+
+    ret = hdev_read_blk_queue_entry(fd, "max_segments", buf, sizeof(buf));
+    if (ret < 0) {
         goto out;
     }
+
     buf[ret] = 0;
     /* The file is ended with '\n', pass 'end' to accept that. */
     ret = qemu_strtol(buf, &end, 10, &max_segments);
@@ -XXX,XX +XXX,XX @@ static int sg_get_max_segments(int fd)
     }
 
 out:
-    if (sysfd != -1) {
-        close(sysfd);
+    return ret;
+#else
+    return -ENOTSUP;
+#endif
+}
+
+static int hdev_get_zoned_model(int fd)
+{
+#ifdef CONFIG_LINUX
+    char buf[32];
+    int ret;
+
+    ret = hdev_read_blk_queue_entry(fd, "zoned", buf, sizeof(buf));
+    if (ret < 0) {
+        ret = BLK_ZONED_MODEL_NONE;
+        goto out;
     }
-    g_free(sysfspath);
+
+    buf[ret - 1] = 0;
+    ret = BLK_ZONED_MODEL_NONE;
+    if (strcmp(buf, "host-managed") == 0) {
+        ret = BLK_ZONED_MODEL_HM;
+    } else if (strcmp(buf, "host-aware") == 0) {
+        ret = BLK_ZONED_MODEL_HA;
+    }
+
+out:
     return ret;
 #else
     return -ENOTSUP;
@@ -XXX,XX +XXX,XX @@ out:
 static void raw_refresh_limits(BlockDriverState *bs, Error **errp)
 {
     BDRVRawState *s = bs->opaque;
+    int ret;
 
     if (bs->sg) {
-        int ret = sg_get_max_transfer_length(s->fd);
+        ret = sg_get_max_transfer_length(s->fd);
 
         if (ret > 0 && ret <= BDRV_REQUEST_MAX_BYTES) {
             bs->bl.max_transfer = pow2floor(ret);
@@ -XXX,XX +XXX,XX @@ static void raw_refresh_limits(BlockDriverState *bs, Error **errp)
         if (ret > 0) {
             bs->bl.max_transfer = MIN(bs->bl.max_transfer, ret * getpagesize());
         }
+
+    }
+
+    ret = hdev_get_zoned_model(s->fd);
+    if (ret >= 0) {
+        bs->bl.zoned_model = ret;
     }
 
     raw_probe_alignment(bs, s->fd, errp);
diff --git a/block/raw-format.c b/block/raw-format.c
index XXXXXXX..XXXXXXX 100644
--- a/block/raw-format.c
+++ b/block/raw-format.c
@@ -XXX,XX +XXX,XX @@ static void raw_refresh_limits(BlockDriverState *bs, Error **errp)
     }
 }
 
+static void raw_get_zoned_info(BlockDriverState *bs)
+{
+    if (!bs->probed) {
+        bs->bl.zoned_model = bs->file->bs->bl.zoned_model;
+    }
+}
+
 static int coroutine_fn raw_co_truncate(BlockDriverState *bs, int64_t offset,
                                         PreallocMode prealloc, Error **errp)
 {
@@ -XXX,XX +XXX,XX @@ BlockDriver bdrv_raw = {
     .bdrv_co_ioctl        = &raw_co_ioctl,
     .create_opts          = &raw_create_opts,
     .bdrv_has_zero_init   = &raw_has_zero_init,
+    .bdrv_get_zoned_info  = &raw_get_zoned_info,
     .strong_runtime_opts  = raw_strong_runtime_opts,
     .mutable_opts         = mutable_opts,
 };
--
2.21.0

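The string-to-model mapping done by hdev_get_zoned_model() can be tried in
isolation with the sketch below. It is not the patch's code: the sketch_
names are hypothetical, the sysfs read is replaced by a plain string
argument, and, unlike the patch (which unconditionally overwrites the last
byte read), this variant strips a trailing newline only if one is present.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum sketch_zoned_model {
    SKETCH_ZONED_MODEL_NONE, /* regular block device */
    SKETCH_ZONED_MODEL_HA,   /* host-aware */
    SKETCH_ZONED_MODEL_HM,   /* host-managed */
};

/* True if the first len bytes of text are exactly the given word. */
static bool sketch_equals(const char *text, size_t len, const char *word)
{
    return len == strlen(word) && memcmp(text, word, len) == 0;
}

/*
 * Map the text of the "zoned" queue attribute to a model value.
 * Anything that is not "host-managed" or "host-aware" is treated
 * as a regular device, which matches the patch's fallback.
 */
static enum sketch_zoned_model sketch_parse_zoned(const char *text)
{
    size_t len;

    assert(text != NULL);
    len = strlen(text);
    /* sysfs values normally end in '\n'; tolerate its absence. */
    while (len > 0 && text[len - 1] == '\n') {
        len--;
    }
    if (sketch_equals(text, len, "host-managed")) {
        return SKETCH_ZONED_MODEL_HM;
    }
    if (sketch_equals(text, len, "host-aware")) {
        return SKETCH_ZONED_MODEL_HA;
    }
    return SKETCH_ZONED_MODEL_NONE;
}
```

On a Linux host, the input would be the content of the queue/zoned entry
of the backing block device, for example "host-managed\n".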
Add a new boolean argument to blkconf_apply_backend_options() to let
the common block code know whether the chosen block backend can handle
zoned block devices or not.

blkconf_apply_backend_options() then sets the BLK_PERM_SUPPORT_ZONED
permission accordingly. The raw code can then use this permission to
allow or deny opening a zoned device by a particular block driver.

Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Acked-by: Paul Durrant <paul.durrant@citrix.com>
---
 hw/block/block.c         |  8 ++++++--
 hw/block/fdc.c           |  4 ++--
 hw/block/nvme.c          |  2 +-
 hw/block/virtio-blk.c    |  2 +-
 hw/block/xen-block.c     |  2 +-
 hw/ide/qdev.c            |  2 +-
 hw/scsi/scsi-disk.c      | 13 +++++++------
 hw/scsi/scsi-generic.c   |  2 +-
 hw/usb/dev-storage.c     |  2 +-
 include/hw/block/block.h |  3 ++-
 10 files changed, 23 insertions(+), 17 deletions(-)

diff --git a/hw/block/block.c b/hw/block/block.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/block/block.c
+++ b/hw/block/block.c
@@ -XXX,XX +XXX,XX @@ void blkconf_blocksizes(BlockConf *conf)
 }
 
 bool blkconf_apply_backend_options(BlockConf *conf, bool readonly,
-                                   bool resizable, Error **errp)
+                                   bool resizable, bool zoned_support,
+                                   Error **errp)
 {
     BlockBackend *blk = conf->blk;
     BlockdevOnError rerror, werror;
@@ -XXX,XX +XXX,XX @@ bool blkconf_apply_backend_options(BlockConf *conf, bool readonly,
     if (!readonly) {
         perm |= BLK_PERM_WRITE;
     }
+    if (zoned_support) {
+        perm |= BLK_PERM_SUPPORT_ZONED;
+    }
 
     shared_perm = BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE_UNCHANGED |
-                  BLK_PERM_GRAPH_MOD;
+                  BLK_PERM_GRAPH_MOD | BLK_PERM_SUPPORT_ZONED;
     if (resizable) {
         shared_perm |= BLK_PERM_RESIZE;
     }
diff --git a/hw/block/fdc.c b/hw/block/fdc.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/block/fdc.c
+++ b/hw/block/fdc.c
@@ -XXX,XX +XXX,XX @@ static void fd_change_cb(void *opaque, bool load, Error **errp)
     } else {
         if (!blkconf_apply_backend_options(drive->conf,
                                            blk_is_read_only(drive->blk), false,
-                                           errp)) {
+                                           false, errp)) {
             return;
         }
     }
@@ -XXX,XX +XXX,XX @@ static void floppy_drive_realize(DeviceState *qdev, Error **errp)
 
     if (!blkconf_apply_backend_options(&dev->conf,
                                        blk_is_read_only(dev->conf.blk),
-                                       false, errp)) {
+                                       false, false, errp)) {
         return;
     }
 
diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -XXX,XX +XXX,XX @@ static void nvme_realize(PCIDevice *pci_dev, Error **errp)
     }
     blkconf_blocksizes(&n->conf);
     if (!blkconf_apply_backend_options(&n->conf, blk_is_read_only(n->conf.blk),
-                                       false, errp)) {
+                                       false, false, errp)) {
         return;
     }
 
diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -XXX,XX +XXX,XX @@ static void virtio_blk_device_realize(DeviceState *dev, Error **errp)
 
     if (!blkconf_apply_backend_options(&conf->conf,
                                        blk_is_read_only(conf->conf.blk), true,
-                                       errp)) {
+                                       false, errp)) {
         return;
     }
     s->original_wce = blk_enable_write_cache(conf->conf.blk);
diff --git a/hw/block/xen-block.c b/hw/block/xen-block.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/block/xen-block.c
+++ b/hw/block/xen-block.c
@@ -XXX,XX +XXX,XX @@ static void xen_block_realize(XenDevice *xendev, Error **errp)
     }
 
     if (!blkconf_apply_backend_options(conf, blockdev->info & VDISK_READONLY,
-                                       true, errp)) {
+                                       true, false, errp)) {
         return;
     }
 
diff --git a/hw/ide/qdev.c b/hw/ide/qdev.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/ide/qdev.c
+++ b/hw/ide/qdev.c
@@ -XXX,XX +XXX,XX @@ static void ide_dev_initfn(IDEDevice *dev, IDEDriveKind kind, Error **errp)
         }
     }
     if (!blkconf_apply_backend_options(&dev->conf, kind == IDE_CD,
-                                       kind != IDE_CD, errp)) {
+                                       kind != IDE_CD, false, errp)) {
         return;
     }
 
diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/scsi/scsi-disk.c
+++ b/hw/scsi/scsi-disk.c
@@ -XXX,XX +XXX,XX @@ static void scsi_disk_unit_attention_reported(SCSIDevice *dev)
     }
 }
 
-static void scsi_realize(SCSIDevice *dev, Error **errp)
+static void scsi_realize(SCSIDevice *dev, bool zoned_support, Error **errp)
 {
     SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, dev);
 
@@ -XXX,XX +XXX,XX @@ static void scsi_realize(SCSIDevice *dev, Error **errp)
     }
     if (!blkconf_apply_backend_options(&dev->conf,
                                        blk_is_read_only(s->qdev.conf.blk),
-                                       dev->type == TYPE_DISK, errp)) {
+                                       dev->type == TYPE_DISK, zoned_support,
+                                       errp)) {
         return;
     }
 
@@ -XXX,XX +XXX,XX @@ static void scsi_hd_realize(SCSIDevice *dev, Error **errp)
     if (!s->product) {
         s->product = g_strdup("QEMU HARDDISK");
     }
-    scsi_realize(&s->qdev, errp);
+    scsi_realize(&s->qdev, false, errp);
     if (ctx) {
         aio_context_release(ctx);
     }
@@ -XXX,XX +XXX,XX @@ static void scsi_cd_realize(SCSIDevice *dev, Error **errp)
     if (!s->product) {
         s->product = g_strdup("QEMU CD-ROM");
     }
-    scsi_realize(&s->qdev, errp);
+    scsi_realize(&s->qdev, false, errp);
     aio_context_release(ctx);
 }
 
@@ -XXX,XX +XXX,XX @@ static void scsi_disk_realize(SCSIDevice *dev, Error **errp)
     Error *local_err = NULL;
 
     if (!dev->conf.blk) {
-        scsi_realize(dev, &local_err);
+        scsi_realize(dev, false, &local_err);
         assert(local_err);
         error_propagate(errp, local_err);
         return;
@@ -XXX,XX +XXX,XX @@ static void scsi_block_realize(SCSIDevice *dev, Error **errp)
      */
     s->features |= (1 << SCSI_DISK_F_NO_REMOVABLE_DEVOPS);
 
-    scsi_realize(&s->qdev, errp);
+    scsi_realize(&s->qdev, true, errp);
     scsi_generic_read_device_inquiry(&s->qdev);
 
 out:
diff --git a/hw/scsi/scsi-generic.c b/hw/scsi/scsi-generic.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/scsi/scsi-generic.c
+++ b/hw/scsi/scsi-generic.c
@@ -XXX,XX +XXX,XX @@ static void scsi_generic_realize(SCSIDevice *s, Error **errp)
     }
     if (!blkconf_apply_backend_options(&s->conf,
                                        blk_is_read_only(s->conf.blk),
-                                       true, errp)) {
+                                       true, true, errp)) {
         return;
     }
 
diff --git a/hw/usb/dev-storage.c b/hw/usb/dev-storage.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/usb/dev-storage.c
+++ b/hw/usb/dev-storage.c
@@ -XXX,XX +XXX,XX @@ static void usb_msd_storage_realize(USBDevice *dev, Error **errp)
 
     blkconf_blocksizes(&s->conf);
     if (!blkconf_apply_backend_options(&s->conf, blk_is_read_only(blk), true,
-                                       errp)) {
+                                       false, errp)) {
         return;
     }
 
diff --git a/include/hw/block/block.h b/include/hw/block/block.h
index XXXXXXX..XXXXXXX 100644
--- a/include/hw/block/block.h
+++ b/include/hw/block/block.h
@@ -XXX,XX +XXX,XX @@ bool blkconf_geometry(BlockConf *conf, int *trans, Error **errp);
 void blkconf_blocksizes(BlockConf *conf);
 bool blkconf_apply_backend_options(BlockConf *conf, bool readonly,
-                                   bool resizable, Error **errp);
+                                   bool resizable, bool zoned_support,
+                                   Error **errp);
 
 /* Hard disk geometry */
 
--
2.21.0

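As a reading aid for the call-site changes in this patch, the sketch below
tabulates the zoned_support value each device model passes and shows how
that flag ends up in the requested permission mask. All sketch_/SKETCH_
names are hypothetical, the device labels are informal, and the bit values
other than 0x20 (as well as the initial CONSISTENT_READ request) are
assumptions made so that the example is self-contained.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PERM_CONSISTENT_READ 0x01 /* assumed value */
#define SKETCH_PERM_WRITE           0x02 /* assumed value */
#define SKETCH_PERM_SUPPORT_ZONED   0x20 /* value from patch 1 */

struct sketch_frontend {
    const char *name;   /* informal label, not a QEMU type name */
    bool zoned_support; /* flag passed at the call site in this patch */
};

static const struct sketch_frontend sketch_frontends[] = {
    { "floppy",       false },
    { "nvme",         false },
    { "virtio-blk",   false },
    { "xen-block",    false },
    { "ide",          false },
    { "scsi-hd",      false },
    { "scsi-cd",      false },
    { "scsi-block",   true  },
    { "scsi-generic", true  },
    { "usb-storage",  false },
};

/* Look up the flag for a label; unknown labels get no zoned support. */
static bool sketch_frontend_supports_zoned(const char *name)
{
    size_t i;

    assert(name != NULL);
    for (i = 0; i < sizeof(sketch_frontends) / sizeof(sketch_frontends[0]); i++) {
        if (strcmp(sketch_frontends[i].name, name) == 0) {
            return sketch_frontends[i].zoned_support;
        }
    }
    return false;
}

/* Build the requested permission mask the way the patched helper does. */
static uint64_t sketch_requested_perm(bool readonly, bool zoned_support)
{
    uint64_t perm = SKETCH_PERM_CONSISTENT_READ;

    if (!readonly) {
        perm |= SKETCH_PERM_WRITE;
    }
    if (zoned_support) {
        perm |= SKETCH_PERM_SUPPORT_ZONED;
    }
    return perm;
}
```

Only the two SCSI passthrough-style devices request the zoned permission,
which matches the attach methods the cover letter describes as working.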
Abort opening a zoned device as a raw file in case the chosen
block backend driver lacks proper support for this type of
storage.

Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
---
 block/file-posix.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/block/file-posix.c b/block/file-posix.c
index XXXXXXX..XXXXXXX 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -XXX,XX +XXX,XX @@ static int raw_check_perm(BlockDriverState *bs, uint64_t perm, uint64_t shared,
             goto fail;
         }
     }
+
+    /*
+     * If we are opening a zoned block device, check if the backend
+     * driver can properly handle such devices, abort if not.
+     */
+    if (bdrv_is_zoned(bs) &&
+        (shared & BLK_PERM_SUPPORT_ZONED) &&
+        !(perm & BLK_PERM_SUPPORT_ZONED)) {
+        error_setg(errp,
+                   "block backend driver doesn't support HM zoned devices");
+        ret = -ENOTSUP;
+        goto fail;
+    }
+
     return 0;
 
 fail:
--
2.21.0

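The three-way condition added to raw_check_perm() can be exercised on its
own with the following sketch. The function and macro names are
hypothetical stand-ins, and the host-managed test is passed in as a plain
boolean instead of calling bdrv_is_zoned(); only the shape of the
condition and the -ENOTSUP result are taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PERM_SUPPORT_ZONED 0x20 /* value from patch 1 */

/*
 * Return -ENOTSUP when a host-managed device is opened by a user
 * that shares the zoned permission but does not request it itself;
 * return 0 in every other case.
 */
static int sketch_check_zoned_perm(bool host_managed, uint64_t perm,
                                   uint64_t shared)
{
    if (host_managed &&
        (shared & SKETCH_PERM_SUPPORT_ZONED) &&
        !(perm & SKETCH_PERM_SUPPORT_ZONED)) {
        return -ENOTSUP;
    }
    return 0;
}

/* Small usage example covering the four interesting combinations. */
static void sketch_check_zoned_perm_examples(void)
{
    const uint64_t z = SKETCH_PERM_SUPPORT_ZONED;

    assert(sketch_check_zoned_perm(true, 0, z) == -ENOTSUP);  /* rejected */
    assert(sketch_check_zoned_perm(true, z, z) == 0);         /* requested */
    assert(sketch_check_zoned_perm(false, 0, z) == 0);        /* not HM */
    assert(sketch_check_zoned_perm(true, 0, 0) == 0);         /* not shared */
}
```

The last case shows that the check only fires when the zoned bit is
present in the shared mask, which patch 3 sets for every caller of
blkconf_apply_backend_options().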
Ping... Any objections to merging this patchset? Ask me if you are not
sure how to validate these patches without having the hardware :)

Currently, attaching zoned block devices (i.e., storage devices
compliant with the ZAC/ZBC standards) using several virtio methods
doesn't work properly, because zoned devices appear as regular block
devices in the guest. This may cause unexpected I/O errors and,
potentially, data corruption.

To be more precise, attaching a zoned device via virtio-pci-blk,
virtio-scsi-pci/scsi-disk or virtio-scsi-pci/scsi-hd demonstrates the
above behavior. The virtio-scsi-pci/scsi-block method works with a
recent patch. The virtio-scsi-pci/scsi-generic method also appears to
handle zoned devices without problems.

This patch set adds code to check whether the backing device that is
being opened is a Host Managed zoned device. If this is the case, the
patch set prohibits attaching such a device for all use cases that lack
proper zoned support.

Host Aware zoned block devices are designed to work as regular block
devices in a guest system that does not support ZBD. Therefore, this
patch set doesn't prohibit attachment of Host Aware devices.

Considering that there are still a couple of different working ways to
attach a ZBD, this patch set provides a reasonable short-term solution
for this problem.

ZBD support for virtio-scsi-pci/scsi-disk and virtio-scsi-pci/scsi-hd
does not seem as necessary. Users will be expected to attach zoned
block devices via virtio-scsi-pci/scsi-block instead.

This patch set contains some Linux-specific code. This code is
necessary to obtain the Zoned Block Device model value from Linux
sysfs.

History:

v1 -> v2:
- rework code to be permission-based
- always allow Host Aware devices to be attached
- add fix for Host Aware attachments aka RCAP output snoop

v2 -> v3:
- drop the patch for RCAP output snoop - merged separately

v3 -> v4:
- rebase to the current code

Dmitry Fomichev (4):
  block: Add zoned device model property
  raw: Recognize zoned backing devices
  block/ide/scsi: Set BLK_PERM_SUPPORT_ZONED
  raw: Don't open ZBDs if backend can't handle them

 block.c                   | 19 +++++++++
 block/file-posix.c        | 88 +++++++++++++++++++++++++++++++++------
 block/raw-format.c        |  8 ++++
 hw/block/block.c          |  8 +++-
 hw/block/fdc.c            |  5 ++-
 hw/block/nvme.c           |  2 +-
 hw/block/virtio-blk.c     |  2 +-
 hw/block/xen-block.c      |  2 +-
 hw/ide/qdev.c             |  2 +-
 hw/scsi/scsi-disk.c       | 13 +++---
 hw/scsi/scsi-generic.c    |  2 +-
 hw/usb/dev-storage.c      |  2 +-
 include/block/block.h     | 21 +++++++++-
 include/block/block_int.h |  4 ++
 include/hw/block/block.h  |  3 +-
 15 files changed, 151 insertions(+), 30 deletions(-)

--
2.21.0

This commit adds Zoned Device Model (as defined in T10 ZBC and
T13 ZAC standards) as a block driver property, along with some
useful access functions.

A new backend driver permission, BLK_PERM_SUPPORT_ZONED, is also
introduced. Only the drivers having this permission will be allowed
to open zoned block devices.

No code is added yet to initialize or check the value of this new
property, therefore this commit doesn't change any functionality.

Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
---
 block.c                   | 19 +++++++++++++++++++
 include/block/block.h     | 21 ++++++++++++++++++++-
 include/block/block_int.h |  4 ++++
 3 files changed, 43 insertions(+), 1 deletion(-)

diff --git a/block.c b/block.c
index XXXXXXX..XXXXXXX 100644
--- a/block.c
+++ b/block.c
@@ -XXX,XX +XXX,XX @@ void bdrv_get_geometry(BlockDriverState *bs, uint64_t *nb_sectors_ptr)
     *nb_sectors_ptr = nb_sectors < 0 ? 0 : nb_sectors;
 }
 
+uint8_t bdrv_get_zoned_model(BlockDriverState *bs)
+{
+    if (bs->drv->bdrv_get_zoned_info) {
+        bs->drv->bdrv_get_zoned_info(bs);
+    }
+
+    return bs->bl.zoned_model;
+}
+
+uint8_t bdrv_is_zoned(BlockDriverState *bs)
+{
+    /*
+     * Host Aware zone devices are supposed to be able to work
+     * just like regular block devices. Thus, we only consider
+     * Host Managed devices to be zoned here.
+     */
+    return bdrv_get_zoned_model(bs) == BLK_ZONED_MODEL_HM;
+}
+
 bool bdrv_is_sg(BlockDriverState *bs)
 {
     return bs->sg;
diff --git a/include/block/block.h b/include/block/block.h
index XXXXXXX..XXXXXXX 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -XXX,XX +XXX,XX @@ enum {
      */
     BLK_PERM_GRAPH_MOD          = 0x10,
 
+    /** This permission is required to open zoned block devices. */
+    BLK_PERM_SUPPORT_ZONED      = 0x20,
+
     BLK_PERM_ALL                = 0x1f,
 
     DEFAULT_PERM_PASSTHROUGH    = BLK_PERM_CONSISTENT_READ
                                  | BLK_PERM_WRITE
                                  | BLK_PERM_WRITE_UNCHANGED
-                                 | BLK_PERM_RESIZE,
+                                 | BLK_PERM_RESIZE
+                                 | BLK_PERM_SUPPORT_ZONED,
     DEFAULT_PERM_UNCHANGED      = BLK_PERM_ALL & ~DEFAULT_PERM_PASSTHROUGH,
 };
 
 char *bdrv_perm_names(uint64_t perm);
 
+/*
+ * Known zoned device models.
+ *
+ * TODO For a Linux host, it could be preferable to include
+ * /usr/include/linux/blkzoned.h instead of defining ZBD-specific
+ * values here.
+ */
+enum blk_zoned_model {
+    BLK_ZONED_MODEL_NONE, /* Regular block device */
+    BLK_ZONED_MODEL_HA,   /* Host-aware zoned block device */
+    BLK_ZONED_MODEL_HM,   /* Host-managed zoned block device */
+};
+
 /* disk I/O throttling */
 void bdrv_init(void);
 void bdrv_init_with_whitelist(void);
@@ -XXX,XX +XXX,XX @@ int64_t bdrv_get_allocated_file_size(BlockDriverState *bs);
 BlockMeasureInfo *bdrv_measure(BlockDriver *drv, QemuOpts *opts,
                                BlockDriverState *in_bs, Error **errp);
 void bdrv_get_geometry(BlockDriverState *bs, uint64_t *nb_sectors_ptr);
+uint8_t bdrv_get_zoned_model(BlockDriverState *bs);
+uint8_t bdrv_is_zoned(BlockDriverState *bs);
 void bdrv_refresh_limits(BlockDriverState *bs, Error **errp);
 int bdrv_commit(BlockDriverState *bs);
 int bdrv_change_backing_file(BlockDriverState *bs,
diff --git a/include/block/block_int.h b/include/block/block_int.h
index XXXXXXX..XXXXXXX 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -XXX,XX +XXX,XX @@ struct BlockDriver {
     bool (*bdrv_debug_is_suspended)(BlockDriverState *bs, const char *tag);
 
     void (*bdrv_refresh_limits)(BlockDriverState *bs, Error **errp);
+    void (*bdrv_get_zoned_info)(BlockDriverState *bs);
 
     /*
      * Returns 1 if newly created images are guaranteed to contain only
@@ -XXX,XX +XXX,XX @@ typedef struct BlockLimits {
 
     /* maximum number of iovec elements */
     int max_iov;
+
+    /* Zoned device model. Zero value indicates a regular block device */
+    uint8_t zoned_model;
 } BlockLimits;
 
 typedef struct BdrvOpBlocker BdrvOpBlocker;
--
2.21.0

The purpose of this patch is to recognize a zoned block device (ZBD)
when it is opened as a raw file. The new code initializes the zoned
model property introduced by the previous commit.

This commit is Linux-specific as it gets the Zoned Block Device Model
value (none/host-managed/host-aware) from sysfs on the host.

In order to avoid code duplication in file-posix.c, a common helper
function is added to read values of sysfs entries under
/sys/block/<dev>/queue. This way, the existing function that reads
the value of the "max_segments" entry and the new function that reads
the "zoned" value both share the same helper code.

Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
---
 block/file-posix.c | 74 ++++++++++++++++++++++++++++++++++++++--------
 block/raw-format.c |  8 +++++
 2 files changed, 70 insertions(+), 12 deletions(-)

diff --git a/block/file-posix.c b/block/file-posix.c
index XXXXXXX..XXXXXXX 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -XXX,XX +XXX,XX @@ static int sg_get_max_transfer_length(int fd)
 #endif
 }
 
-static int sg_get_max_segments(int fd)
+static int hdev_read_blk_queue_entry(int fd, const char *key,
+                                     char *buf, int buf_len)
 {
 #ifdef CONFIG_LINUX
-    char buf[32];
-    const char *end;
     char *sysfspath = NULL;
     int ret;
     int sysfd = -1;
-    long max_segments;
     struct stat st;
 
     if (fstat(fd, &st)) {
@@ -XXX,XX +XXX,XX @@ static int sg_get_max_segments(int fd)
         goto out;
     }
 
-    sysfspath = g_strdup_printf("/sys/dev/block/%u:%u/queue/max_segments",
-                                major(st.st_rdev), minor(st.st_rdev));
+    sysfspath = g_strdup_printf("/sys/dev/block/%u:%u/queue/%s",
+                                major(st.st_rdev), minor(st.st_rdev), key);
     sysfd = open(sysfspath, O_RDONLY);
     if (sysfd == -1) {
         ret = -errno;
         goto out;
     }
     do {
-        ret = read(sysfd, buf, sizeof(buf) - 1);
+        ret = read(sysfd, buf, buf_len - 1);
     } while (ret == -1 && errno == EINTR);
     if (ret < 0) {
         ret = -errno;
-        goto out;
     } else if (ret == 0) {
         ret = -EIO;
+    }
+out:
+    if (sysfd != -1) {
+        close(sysfd);
+    }
+    g_free(sysfspath);
+    return ret;
+#else
+    return -ENOTSUP;
+#endif
+}
+
+static int sg_get_max_segments(int fd)
+{
+#ifdef CONFIG_LINUX
+    char buf[32];
+    const char *end;
+    int ret;
+    long max_segments;
+
+    ret = hdev_read_blk_queue_entry(fd, "max_segments", buf, sizeof(buf));
+    if (ret < 0) {
         goto out;
     }
+
     buf[ret] = 0;
     /* The file is ended with '\n', pass 'end' to accept that. */
     ret = qemu_strtol(buf, &end, 10, &max_segments);
@@ -XXX,XX +XXX,XX @@ static int sg_get_max_segments(int fd)
     }
 
 out:
-    if (sysfd != -1) {
-        close(sysfd);
+    return ret;
+#else
+    return -ENOTSUP;
+#endif
+}
+
+static int hdev_get_zoned_model(int fd)
+{
+#ifdef CONFIG_LINUX
+    char buf[32];
+    int ret;
+
+    ret = hdev_read_blk_queue_entry(fd, "zoned", buf, sizeof(buf));
+    if (ret < 0) {
+        ret = BLK_ZONED_MODEL_NONE;
+        goto out;
     }
-    g_free(sysfspath);
+
+    buf[ret - 1] = 0;
+    ret = BLK_ZONED_MODEL_NONE;
+    if (strcmp(buf, "host-managed") == 0) {
+        ret = BLK_ZONED_MODEL_HM;
+    } else if (strcmp(buf, "host-aware") == 0) {
+        ret = BLK_ZONED_MODEL_HA;
+    }
+
+out:
     return ret;
 #else
     return -ENOTSUP;
@@ -XXX,XX +XXX,XX @@ out:
 static void raw_refresh_limits(BlockDriverState *bs, Error **errp)
 {
     BDRVRawState *s = bs->opaque;
+    int ret;
 
     if (bs->sg) {
-        int ret = sg_get_max_transfer_length(s->fd);
+        ret = sg_get_max_transfer_length(s->fd);
 
         if (ret > 0 && ret <= BDRV_REQUEST_MAX_BYTES) {
             bs->bl.max_transfer = pow2floor(ret);
@@ -XXX,XX +XXX,XX @@ static void raw_refresh_limits(BlockDriverState *bs, Error **errp)
         if (ret > 0) {
             bs->bl.max_transfer = MIN(bs->bl.max_transfer, ret * getpagesize());
         }
+
+    }
+
+    ret = hdev_get_zoned_model(s->fd);
+    if (ret >= 0) {
+        bs->bl.zoned_model = ret;
     }
 
     raw_probe_alignment(bs, s->fd, errp);
diff --git a/block/raw-format.c b/block/raw-format.c
index XXXXXXX..XXXXXXX 100644
--- a/block/raw-format.c
+++ b/block/raw-format.c
@@ -XXX,XX +XXX,XX @@ static void raw_refresh_limits(BlockDriverState *bs, Error **errp)
     }
 }
 
+static void raw_get_zoned_info(BlockDriverState *bs)
+{
+    if (!bs->probed) {
+        bs->bl.zoned_model = bs->file->bs->bl.zoned_model;
+    }
+}
+
 static int coroutine_fn raw_co_truncate(BlockDriverState *bs, int64_t offset,
                                         PreallocMode prealloc, Error **errp)
 {
@@ -XXX,XX +XXX,XX @@ BlockDriver bdrv_raw = {
     .create_opts          = &raw_create_opts,
     .bdrv_has_zero_init   = &raw_has_zero_init,
     .bdrv_has_zero_init_truncate = &raw_has_zero_init_truncate,
+    .bdrv_get_zoned_info  = &raw_get_zoned_info,
     .strong_runtime_opts  = raw_strong_runtime_opts,
     .mutable_opts         = mutable_opts,
 };
--
2.21.0

Add a new boolean argument to blkconf_apply_backend_options() to let
the common block code know whether the chosen block backend can handle
zoned block devices or not.

blkconf_apply_backend_options() then sets the BLK_PERM_SUPPORT_ZONED
permission accordingly. The raw code can then use this permission to
allow or deny opening a zoned device by a particular block driver.

Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Acked-by: Paul Durrant <paul.durrant@citrix.com>
---
 hw/block/block.c         |  8 ++++++--
 hw/block/fdc.c           |  5 +++--
 hw/block/nvme.c          |  2 +-
 hw/block/virtio-blk.c    |  2 +-
 hw/block/xen-block.c     |  2 +-
 hw/ide/qdev.c            |  2 +-
 hw/scsi/scsi-disk.c      | 13 +++++++------
 hw/scsi/scsi-generic.c   |  2 +-
 hw/usb/dev-storage.c     |  2 +-
 include/hw/block/block.h |  3 ++-
 10 files changed, 24 insertions(+), 17 deletions(-)

diff --git a/hw/block/block.c b/hw/block/block.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/block/block.c
+++ b/hw/block/block.c
@@ -XXX,XX +XXX,XX @@ void blkconf_blocksizes(BlockConf *conf)
 }
 
 bool blkconf_apply_backend_options(BlockConf *conf, bool readonly,
-                                   bool resizable, Error **errp)
+                                   bool resizable, bool zoned_support,
+                                   Error **errp)
 {
     BlockBackend *blk = conf->blk;
     BlockdevOnError rerror, werror;
@@ -XXX,XX +XXX,XX @@ bool blkconf_apply_backend_options(BlockConf *conf, bool readonly,
     if (!readonly) {
         perm |= BLK_PERM_WRITE;
     }
+    if (zoned_support) {
+        perm |= BLK_PERM_SUPPORT_ZONED;
+    }
 
     shared_perm = BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE_UNCHANGED |
-                  BLK_PERM_GRAPH_MOD;
+                  BLK_PERM_GRAPH_MOD | BLK_PERM_SUPPORT_ZONED;
     if (resizable) {
         shared_perm |= BLK_PERM_RESIZE;
     }
diff --git a/hw/block/fdc.c b/hw/block/fdc.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/block/fdc.c
+++ b/hw/block/fdc.c
@@ -XXX,XX +XXX,XX @@ static void fd_change_cb(void *opaque, bool load, Error **errp)
     } else {
         if (!blkconf_apply_backend_options(drive->conf,
                                            blk_is_read_only(drive->blk), false,
-                                           errp)) {
+                                           false, errp)) {
             return;
         }
     }
@@ -XXX,XX +XXX,XX @@ static void floppy_drive_realize(DeviceState *qdev, Error **errp)
     dev->conf.rerror = BLOCKDEV_ON_ERROR_AUTO;
     dev->conf.werror = BLOCKDEV_ON_ERROR_AUTO;
 
-    if (!blkconf_apply_backend_options(&dev->conf, read_only, false, errp)) {
+    if (!blkconf_apply_backend_options(&dev->conf, read_only, false, false,
+                                       errp)) {
         return;
     }
 
diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -XXX,XX +XXX,XX @@ static void nvme_realize(PCIDevice *pci_dev, Error **errp)
     }
     blkconf_blocksizes(&n->conf);
     if (!blkconf_apply_backend_options(&n->conf, blk_is_read_only(n->conf.blk),
-                                       false, errp)) {
+                                       false, false, errp)) {
         return;
     }
 
diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -XXX,XX +XXX,XX @@ static void virtio_blk_device_realize(DeviceState *dev, Error **errp)
 
     if (!blkconf_apply_backend_options(&conf->conf,
                                        blk_is_read_only(conf->conf.blk), true,
-                                       errp)) {
+                                       false, errp)) {
         return;
     }
     s->original_wce = blk_enable_write_cache(conf->conf.blk);
diff --git a/hw/block/xen-block.c b/hw/block/xen-block.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/block/xen-block.c
+++ b/hw/block/xen-block.c
@@ -XXX,XX +XXX,XX @@ static void xen_block_realize(XenDevice *xendev, Error **errp)
     }
 
     if (!blkconf_apply_backend_options(conf, blockdev->info & VDISK_READONLY,
-                                       true, errp)) {
+                                       true, false, errp)) {
         return;
     }
 
diff --git a/hw/ide/qdev.c b/hw/ide/qdev.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/ide/qdev.c
+++ b/hw/ide/qdev.c
@@ -XXX,XX +XXX,XX @@ static void ide_dev_initfn(IDEDevice *dev, IDEDriveKind kind, Error **errp)
         }
     }
     if (!blkconf_apply_backend_options(&dev->conf, kind == IDE_CD,
-                                       kind != IDE_CD, errp)) {
+                                       kind != IDE_CD, false, errp)) {
         return;
     }
 
diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/scsi/scsi-disk.c
+++ b/hw/scsi/scsi-disk.c
@@ -XXX,XX +XXX,XX @@ static void scsi_disk_unit_attention_reported(SCSIDevice *dev)
     }
 }
 
-static void scsi_realize(SCSIDevice *dev, Error **errp)
+static void scsi_realize(SCSIDevice *dev, bool zoned_support, Error **errp)
 {
     SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, dev);
     bool read_only;
 
@@ -XXX,XX +XXX,XX @@ static void scsi_realize(SCSIDevice *dev, Error **errp)
     }
     if (!blkconf_apply_backend_options(&dev->conf, read_only,
-                                       dev->type == TYPE_DISK, errp)) {
+                                       dev->type == TYPE_DISK, zoned_support,
+                                       errp)) {
         return;
     }
 
@@ -XXX,XX +XXX,XX @@ static void scsi_hd_realize(SCSIDevice *dev, Error **errp)
     if (!s->product) {
         s->product = g_strdup("QEMU HARDDISK");
     }
-    scsi_realize(&s->qdev, errp);
+    scsi_realize(&s->qdev, false, errp);
     if (ctx) {
         aio_context_release(ctx);
     }
@@ -XXX,XX +XXX,XX @@ static void scsi_cd_realize(SCSIDevice *dev, Error **errp)
     if (!s->product) {
         s->product = g_strdup("QEMU CD-ROM");
     }
-    scsi_realize(&s->qdev, errp);
+    scsi_realize(&s->qdev, false, errp);
     aio_context_release(ctx);
 }
 
@@ -XXX,XX +XXX,XX @@ static void scsi_disk_realize(SCSIDevice *dev, Error **errp)
     Error *local_err = NULL;
 
     if (!dev->conf.blk) {
-        scsi_realize(dev, &local_err);
+        scsi_realize(dev, false, &local_err);
         assert(local_err);
         error_propagate(errp, local_err);
         return;
@@ -XXX,XX +XXX,XX @@ static void scsi_block_realize(SCSIDevice *dev, Error **errp)
      */
     s->features |= (1 << SCSI_DISK_F_NO_REMOVABLE_DEVOPS);
 
-    scsi_realize(&s->qdev, errp);
+    scsi_realize(&s->qdev, true, errp);
     scsi_generic_read_device_inquiry(&s->qdev);
 
 out:
diff --git a/hw/scsi/scsi-generic.c b/hw/scsi/scsi-generic.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/scsi/scsi-generic.c
+++ b/hw/scsi/scsi-generic.c
@@ -XXX,XX +XXX,XX @@ static void scsi_generic_realize(SCSIDevice *s, Error **errp)
     }
     if (!blkconf_apply_backend_options(&s->conf,
                                        blk_is_read_only(s->conf.blk),
-                                       true, errp)) {
+                                       true, true, errp)) {
         return;
     }
 
diff --git a/hw/usb/dev-storage.c b/hw/usb/dev-storage.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/usb/dev-storage.c
+++ b/hw/usb/dev-storage.c
@@ -XXX,XX +XXX,XX @@ static void usb_msd_storage_realize(USBDevice *dev, Error **errp)
 
     blkconf_blocksizes(&s->conf);
     if (!blkconf_apply_backend_options(&s->conf, blk_is_read_only(blk), true,
-                                       errp)) {
+                                       false, errp)) {
         return;
     }
 
diff --git a/include/hw/block/block.h b/include/hw/block/block.h
index XXXXXXX..XXXXXXX 100644
--- a/include/hw/block/block.h
+++ b/include/hw/block/block.h
@@ -XXX,XX +XXX,XX @@ bool blkconf_geometry(BlockConf *conf, int *trans, Error **errp);
 void blkconf_blocksizes(BlockConf *conf);
 bool blkconf_apply_backend_options(BlockConf *conf, bool readonly,
-                                   bool resizable, Error **errp);
+                                   bool resizable, bool zoned_support,
+                                   Error **errp);
 
 /* Hard disk geometry */
 
--
2.21.0
Abort opening a zoned device as a raw file in case the chosen
block backend driver lacks proper support for this type of
storage.

Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
---
 block/file-posix.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/block/file-posix.c b/block/file-posix.c
index XXXXXXX..XXXXXXX 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -XXX,XX +XXX,XX @@ static int raw_check_perm(BlockDriverState *bs, uint64_t perm, uint64_t shared,
             goto fail;
         }
     }
+
+    /*
+     * If we are opening a zoned block device, check if the backend
+     * driver can properly handle such devices, abort if not.
+     */
+    if (bdrv_is_zoned(bs) &&
+        (shared & BLK_PERM_SUPPORT_ZONED) &&
+        !(perm & BLK_PERM_SUPPORT_ZONED)) {
+        error_setg(errp,
+                   "block backend driver doesn't support HM zoned devices");
+        ret = -ENOTSUP;
+        goto fail;
+    }
+
     return 0;
 fail:
-- 
2.21.0
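
For readers following the series, the interaction between the zoned_support argument and the check in raw_check_perm() can be sketched outside of QEMU. The block below is only an illustration under stated assumptions: the SKETCH_* bit values and all sketch_* function names are hypothetical and not part of QEMU, the other permission bits are omitted, and the is_zoned flag stands in for bdrv_is_zoned(bs).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit values; the real BLK_PERM_* constants are defined by QEMU. */
#define SKETCH_PERM_WRITE          (1ULL << 0)
#define SKETCH_PERM_SUPPORT_ZONED  (1ULL << 1)

/*
 * Simplified model of the perm mask built in blkconf_apply_backend_options():
 * the zoned bit is requested only when the device model passes
 * zoned_support == true.
 */
static uint64_t sketch_frontend_perm(bool readonly, bool zoned_support)
{
    uint64_t perm = 0;

    if (!readonly) {
        perm |= SKETCH_PERM_WRITE;
    }
    if (zoned_support) {
        perm |= SKETCH_PERM_SUPPORT_ZONED;
    }
    return perm;
}

/* Simplified model of shared_perm: the zoned bit is always shared. */
static uint64_t sketch_shared_perm(void)
{
    return SKETCH_PERM_SUPPORT_ZONED;
}

/*
 * Simplified model of the check added to raw_check_perm(): a zoned device
 * is refused when the zoned bit is shared but was not requested.
 */
static int sketch_check_zoned(bool is_zoned, uint64_t perm, uint64_t shared)
{
    if (is_zoned &&
        (shared & SKETCH_PERM_SUPPORT_ZONED) &&
        !(perm & SKETCH_PERM_SUPPORT_ZONED)) {
        return -ENOTSUP;
    }
    return 0;
}

/* Walks through the three cases discussed in the series. */
void sketch_self_check(void)
{
    uint64_t shared = sketch_shared_perm();

    /* A caller passing zoned_support == false (e.g. virtio-blk) is refused. */
    assert(sketch_check_zoned(true, sketch_frontend_perm(false, false),
                              shared) == -ENOTSUP);
    /* A caller passing zoned_support == true (e.g. scsi-block) is accepted. */
    assert(sketch_check_zoned(true, sketch_frontend_perm(false, true),
                              shared) == 0);
    /* A regular (non-zoned) backing device is never affected. */
    assert(sketch_check_zoned(false, sketch_frontend_perm(false, false),
                              shared) == 0);
}
```

In this model only the callers that pass true for zoned_support (scsi_block_realize() and scsi_generic_realize() in the first patch) end up with the zoned bit in perm, so every other device model is rejected when the backing device is zoned.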