From nobody Wed Apr 16 23:14:53 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=redhat.com Return-Path: Received: from lists.gnu.org (209.51.188.17 [209.51.188.17]) by mx.zohomail.com with SMTPS id 1549864442363511.2238553896667; Sun, 10 Feb 2019 21:54:02 -0800 (PST) Received: from localhost ([127.0.0.1]:44924 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gt4XV-0003h8-Mt for importer@patchew.org; Mon, 11 Feb 2019 00:53:57 -0500 Received: from eggs.gnu.org ([209.51.188.92]:49894) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gt4VV-0002SB-K1 for qemu-devel@nongnu.org; Mon, 11 Feb 2019 00:51:55 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gt4VP-0002tz-6p for qemu-devel@nongnu.org; Mon, 11 Feb 2019 00:51:51 -0500 Received: from mx1.redhat.com ([209.132.183.28]:40276) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1gt4VB-0002mr-QT; Mon, 11 Feb 2019 00:51:34 -0500 Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.phx2.redhat.com [10.5.11.22]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id ED862811AC; Mon, 11 Feb 2019 05:51:29 +0000 (UTC) Received: from localhost (unknown [10.64.242.44]) by smtp.corp.redhat.com (Postfix) with ESMTP id 61D0010021B6; Mon, 11 Feb 2019 05:51:29 +0000 (UTC) From: Stefan Hajnoczi To: qemu-devel@nongnu.org Date: Mon, 11 Feb 2019 13:50:38 +0800 Message-Id: <20190211055040.13528-8-stefanha@redhat.com> In-Reply-To: <20190211055040.13528-1-stefanha@redhat.com> References: <20190211055040.13528-1-stefanha@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.22 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.27]); Mon, 11 Feb 2019 05:51:30 +0000 (UTC) Content-Transfer-Encoding: quoted-printable X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 209.132.183.28 Subject: [Qemu-devel] [PULL 7/9] virtio-blk: add DISCARD and WRITE_ZEROES features X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Laurent Vivier , Peter Maydell , Thomas Huth , qemu-block@nongnu.org, "Michael S. Tsirkin" , Pankaj Gupta , Stefano Garzarella , Max Reitz , Stefan Hajnoczi , Paolo Bonzini , Kevin Wolf , Eduardo Habkost Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" Content-Type: text/plain; charset="utf-8" From: Stefano Garzarella This patch adds the support of DISCARD and WRITE_ZEROES commands, that have been introduced in the virtio-blk protocol to have better performance when using SSD backend. We support only one segment per request since multiple segments are not widely used and there are no userspace APIs that allow applications to submit multiple segments in a single call. Reviewed-by: Michael S. Tsirkin Reviewed-by: Stefan Hajnoczi Signed-off-by: Stefano Garzarella Acked-by: Pankaj Gupta Message-id: 20190208134950.187665-5-sgarzare@redhat.com Signed-off-by: Stefan Hajnoczi --- include/hw/virtio/virtio-blk.h | 2 + hw/block/virtio-blk.c | 184 +++++++++++++++++++++++++++++++++ 2 files changed, 186 insertions(+) diff --git a/include/hw/virtio/virtio-blk.h b/include/hw/virtio/virtio-blk.h index f7345b0511..015b523fe0 100644 --- a/include/hw/virtio/virtio-blk.h +++ b/include/hw/virtio/virtio-blk.h @@ -38,6 +38,8 @@ struct VirtIOBlkConf uint32_t request_merging; uint16_t num_queues; uint16_t queue_size; + uint32_t max_discard_sectors; + uint32_t max_write_zeroes_sectors; }; =20 struct VirtIOBlockDataPlane; diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c index feeaf77965..f2b73bb8cf 100644 --- a/hw/block/virtio-blk.c +++ b/hw/block/virtio-blk.c @@ -146,6 +146,30 @@ out: aio_context_release(blk_get_aio_context(s->conf.conf.blk)); } =20 +static void virtio_blk_discard_write_zeroes_complete(void *opaque, int ret) +{ + VirtIOBlockReq *req =3D opaque; + VirtIOBlock *s =3D req->dev; + bool is_write_zeroes =3D (virtio_ldl_p(VIRTIO_DEVICE(s), &req->out.typ= e) & + ~VIRTIO_BLK_T_BARRIER) =3D=3D VIRTIO_BLK_T_WRI= TE_ZEROES; + + aio_context_acquire(blk_get_aio_context(s->conf.conf.blk)); + if (ret) { + if (virtio_blk_handle_rw_error(req, -ret, false, is_write_zeroes))= { + goto out; + } + } + + virtio_blk_req_complete(req, VIRTIO_BLK_S_OK); + if (is_write_zeroes) { + block_acct_done(blk_get_stats(s->blk), &req->acct); + } + virtio_blk_free_request(req); + +out: + aio_context_release(blk_get_aio_context(s->conf.conf.blk)); +} + #ifdef __linux__ =20 typedef struct { @@ -479,6 +503,84 @@ static bool virtio_blk_sect_range_ok(VirtIOBlock *dev, return true; } =20 +static uint8_t virtio_blk_handle_discard_write_zeroes(VirtIOBlockReq *req, + struct virtio_blk_discard_write_zeroes *dwz_hdr, bool is_write_zeroes) +{ + VirtIOBlock *s =3D req->dev; + VirtIODevice *vdev =3D VIRTIO_DEVICE(s); + uint64_t sector; + uint32_t num_sectors, flags, max_sectors; + uint8_t err_status; + int bytes; + + sector =3D virtio_ldq_p(vdev, &dwz_hdr->sector); + num_sectors =3D virtio_ldl_p(vdev, &dwz_hdr->num_sectors); + flags =3D virtio_ldl_p(vdev, &dwz_hdr->flags); + max_sectors =3D is_write_zeroes ? s->conf.max_write_zeroes_sectors : + s->conf.max_discard_sectors; + + /* + * max_sectors is at most BDRV_REQUEST_MAX_SECTORS, this check + * make us sure that "num_sectors << BDRV_SECTOR_BITS" can fit in + * the integer variable. + */ + if (unlikely(num_sectors > max_sectors)) { + err_status =3D VIRTIO_BLK_S_IOERR; + goto err; + } + + bytes =3D num_sectors << BDRV_SECTOR_BITS; + + if (unlikely(!virtio_blk_sect_range_ok(s, sector, bytes))) { + err_status =3D VIRTIO_BLK_S_IOERR; + goto err; + } + + /* + * The device MUST set the status byte to VIRTIO_BLK_S_UNSUPP for disc= ard + * and write zeroes commands if any unknown flag is set. + */ + if (unlikely(flags & ~VIRTIO_BLK_WRITE_ZEROES_FLAG_UNMAP)) { + err_status =3D VIRTIO_BLK_S_UNSUPP; + goto err; + } + + if (is_write_zeroes) { /* VIRTIO_BLK_T_WRITE_ZEROES */ + int blk_aio_flags =3D 0; + + if (flags & VIRTIO_BLK_WRITE_ZEROES_FLAG_UNMAP) { + blk_aio_flags |=3D BDRV_REQ_MAY_UNMAP; + } + + block_acct_start(blk_get_stats(s->blk), &req->acct, bytes, + BLOCK_ACCT_WRITE); + + blk_aio_pwrite_zeroes(s->blk, sector << BDRV_SECTOR_BITS, + bytes, blk_aio_flags, + virtio_blk_discard_write_zeroes_complete, re= q); + } else { /* VIRTIO_BLK_T_DISCARD */ + /* + * The device MUST set the status byte to VIRTIO_BLK_S_UNSUPP for + * discard commands if the unmap flag is set. + */ + if (unlikely(flags & VIRTIO_BLK_WRITE_ZEROES_FLAG_UNMAP)) { + err_status =3D VIRTIO_BLK_S_UNSUPP; + goto err; + } + + blk_aio_pdiscard(s->blk, sector << BDRV_SECTOR_BITS, bytes, + virtio_blk_discard_write_zeroes_complete, req); + } + + return VIRTIO_BLK_S_OK; + +err: + if (is_write_zeroes) { + block_acct_invalid(blk_get_stats(s->blk), BLOCK_ACCT_WRITE); + } + return err_status; +} + static int virtio_blk_handle_request(VirtIOBlockReq *req, MultiReqBuffer *= mrb) { uint32_t type; @@ -580,6 +682,47 @@ static int virtio_blk_handle_request(VirtIOBlockReq *r= eq, MultiReqBuffer *mrb) virtio_blk_free_request(req); break; } + /* + * VIRTIO_BLK_T_DISCARD and VIRTIO_BLK_T_WRITE_ZEROES are defined with + * VIRTIO_BLK_T_OUT flag set. We masked this flag in the switch statem= ent, + * so we must mask it for these requests, then we will check if it is = set. + */ + case VIRTIO_BLK_T_DISCARD & ~VIRTIO_BLK_T_OUT: + case VIRTIO_BLK_T_WRITE_ZEROES & ~VIRTIO_BLK_T_OUT: + { + struct virtio_blk_discard_write_zeroes dwz_hdr; + size_t out_len =3D iov_size(out_iov, out_num); + bool is_write_zeroes =3D (type & ~VIRTIO_BLK_T_BARRIER) =3D=3D + VIRTIO_BLK_T_WRITE_ZEROES; + uint8_t err_status; + + /* + * Unsupported if VIRTIO_BLK_T_OUT is not set or the request conta= ins + * more than one segment. + */ + if (unlikely(!(type & VIRTIO_BLK_T_OUT) || + out_len > sizeof(dwz_hdr))) { + virtio_blk_req_complete(req, VIRTIO_BLK_S_UNSUPP); + virtio_blk_free_request(req); + return 0; + } + + if (unlikely(iov_to_buf(out_iov, out_num, 0, &dwz_hdr, + sizeof(dwz_hdr)) !=3D sizeof(dwz_hdr))) { + virtio_error(vdev, "virtio-blk discard/write_zeroes header" + " too short"); + return -1; + } + + err_status =3D virtio_blk_handle_discard_write_zeroes(req, &dwz_hd= r, + is_write_zeroe= s); + if (err_status !=3D VIRTIO_BLK_S_OK) { + virtio_blk_req_complete(req, err_status); + virtio_blk_free_request(req); + } + + break; + } default: virtio_blk_req_complete(req, VIRTIO_BLK_S_UNSUPP); virtio_blk_free_request(req); @@ -759,6 +902,24 @@ static void virtio_blk_update_config(VirtIODevice *vde= v, uint8_t *config) blkcfg.alignment_offset =3D 0; blkcfg.wce =3D blk_enable_write_cache(s->blk); virtio_stw_p(vdev, &blkcfg.num_queues, s->conf.num_queues); + if (virtio_has_feature(s->host_features, VIRTIO_BLK_F_DISCARD)) { + virtio_stl_p(vdev, &blkcfg.max_discard_sectors, + s->conf.max_discard_sectors); + virtio_stl_p(vdev, &blkcfg.discard_sector_alignment, + blk_size >> BDRV_SECTOR_BITS); + /* + * We support only one segment per request since multiple segments + * are not widely used and there are no userspace APIs that allow + * applications to submit multiple segments in a single call. + */ + virtio_stl_p(vdev, &blkcfg.max_discard_seg, 1); + } + if (virtio_has_feature(s->host_features, VIRTIO_BLK_F_WRITE_ZEROES)) { + virtio_stl_p(vdev, &blkcfg.max_write_zeroes_sectors, + s->conf.max_write_zeroes_sectors); + blkcfg.write_zeroes_may_unmap =3D 1; + virtio_stl_p(vdev, &blkcfg.max_write_zeroes_seg, 1); + } memcpy(config, &blkcfg, sizeof(struct virtio_blk_config)); } =20 @@ -950,6 +1111,25 @@ static void virtio_blk_device_realize(DeviceState *de= v, Error **errp) return; } =20 + if (virtio_has_feature(s->host_features, VIRTIO_BLK_F_DISCARD) && + (!conf->max_discard_sectors || + conf->max_discard_sectors > BDRV_REQUEST_MAX_SECTORS)) { + error_setg(errp, "invalid max-discard-sectors property (%" PRIu32 = ")" + ", must be between 1 and %d", + conf->max_discard_sectors, (int)BDRV_REQUEST_MAX_SECTOR= S); + return; + } + + if (virtio_has_feature(s->host_features, VIRTIO_BLK_F_WRITE_ZEROES) && + (!conf->max_write_zeroes_sectors || + conf->max_write_zeroes_sectors > BDRV_REQUEST_MAX_SECTORS)) { + error_setg(errp, "invalid max-write-zeroes-sectors property (%" PR= Iu32 + "), must be between 1 and %d", + conf->max_write_zeroes_sectors, + (int)BDRV_REQUEST_MAX_SECTORS); + return; + } + virtio_init(vdev, "virtio-blk", VIRTIO_ID_BLOCK, sizeof(struct virtio_blk_config)); =20 @@ -1026,6 +1206,10 @@ static Property virtio_blk_properties[] =3D { VIRTIO_BLK_F_DISCARD, true), DEFINE_PROP_BIT64("write-zeroes", VirtIOBlock, host_features, VIRTIO_BLK_F_WRITE_ZEROES, true), + DEFINE_PROP_UINT32("max-discard-sectors", VirtIOBlock, + conf.max_discard_sectors, BDRV_REQUEST_MAX_SECTORS), + DEFINE_PROP_UINT32("max-write-zeroes-sectors", VirtIOBlock, + conf.max_write_zeroes_sectors, BDRV_REQUEST_MAX_SEC= TORS), DEFINE_PROP_END_OF_LIST(), }; =20 --=20 2.20.1