From: Klaus Jensen
To: qemu-block@nongnu.org
Cc: Kevin Wolf, Beata Michalska, Klaus Jensen, qemu-devel@nongnu.org,
    Max Reitz, Klaus Jensen, Keith Busch, Javier Gonzalez, Maxim Levitsky
Subject: [PATCH v7 36/48] nvme: allow multiple aios per command
Date: Wed, 15 Apr 2020 07:51:28 +0200
Message-Id: <20200415055140.466900-37-its@irrelevant.dk>
In-Reply-To: <20200415055140.466900-1-its@irrelevant.dk>
References: <20200415055140.466900-1-its@irrelevant.dk>

From: Klaus Jensen

This refactors how the device issues asynchronous block backend requests.
The NvmeRequest now holds a queue of the NvmeAIOs associated with the
command, which allows multiple AIOs to be issued for a single command.
The device posts a completion queue entry only when all of those AIOs
have completed.

Because the device is currently guaranteed to issue only a single AIO per
command, the benefit is not immediately obvious. But this functionality is
required to support metadata, the Dataset Management command and other
features.
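As a minimal, standalone sketch of that completion scheme (not the QEMU
code itself): a request owns a tail queue of outstanding AIOs, every
backend completion removes its AIO from the queue, and the request is
completed only once the queue runs empty. The sketch assumes the BSD
<sys/queue.h> TAILQ macros as a stand-in for QEMU's QTAILQ, and the
simulated completion order in main() is purely illustrative:

/*
 * Sketch only: a request completes when its queue of outstanding AIOs
 * becomes empty. Names loosely mirror the patch; the error handling is
 * reduced to a "first error wins" status field.
 */
#include <stdio.h>
#include <stdlib.h>
#include <sys/queue.h>

typedef struct Request Request;
typedef struct AIO AIO;

struct AIO {
    Request *req;
    TAILQ_ENTRY(AIO) entry;
};

struct Request {
    int status;                     /* 0 == success, first error wins */
    TAILQ_HEAD(, AIO) aio_tailq;    /* outstanding AIOs for this command */
};

static void request_init(Request *req)
{
    req->status = 0;
    TAILQ_INIT(&req->aio_tailq);
}

static AIO *request_add_aio(Request *req)
{
    AIO *aio = calloc(1, sizeof(*aio));
    if (!aio) {
        abort();
    }
    aio->req = req;
    TAILQ_INSERT_TAIL(&req->aio_tailq, aio, entry);
    return aio;
}

/* Called once per AIO when the (simulated) block backend finishes it. */
static void aio_cb(AIO *aio, int ret)
{
    Request *req = aio->req;

    TAILQ_REMOVE(&req->aio_tailq, aio, entry);
    if (ret && !req->status) {
        req->status = ret;
    }
    free(aio);

    /* Post the completion only when no AIOs remain outstanding. */
    if (TAILQ_EMPTY(&req->aio_tailq)) {
        printf("request complete, status %d\n", req->status);
    }
}

int main(void)
{
    Request req;
    AIO *a, *b;

    request_init(&req);
    a = request_add_aio(&req);
    b = request_add_aio(&req);

    aio_cb(a, 0);   /* first AIO finishes; request still pending */
    aio_cb(b, 0);   /* last AIO finishes; request completes here  */
    return 0;
}

The patch additionally does per-AIO block accounting, maps each AIO
failure to an NVMe status code, and lets a request override the default
completion with nvme_req_set_cb(), but the ordering rule is the same: no
completion queue entry is posted while any AIO is still outstanding.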
Signed-off-by: Klaus Jensen
Signed-off-by: Klaus Jensen
Acked-by: Keith Busch
---
 hw/block/nvme.c       | 328 +++++++++++++++++++++++++++++++++---------
 hw/block/nvme.h       | 101 +++++++++++--
 hw/block/trace-events |   3 +
 3 files changed, 350 insertions(+), 82 deletions(-)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 846aa31eaae9..c123be10fd0d 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -59,6 +59,7 @@
 } while (0)
 
 static void nvme_process_sq(void *opaque);
+static void nvme_aio_cb(void *opaque, int ret);
 
 static inline void *nvme_addr_to_cmb(NvmeCtrl *n, hwaddr addr)
 {
@@ -163,6 +164,17 @@ static void nvme_req_clear(NvmeRequest *req)
 {
     req->ns = NULL;
     memset(&req->cqe, 0x0, sizeof(req->cqe));
+    req->status = NVME_SUCCESS;
+    req->slba = req->nlb = 0x0;
+    req->cb = req->cb_arg = NULL;
+
+    if (req->qsg.sg) {
+        qemu_sglist_destroy(&req->qsg);
+    }
+
+    if (req->iov.iov) {
+        qemu_iovec_destroy(&req->iov);
+    }
 }
 
 static uint16_t nvme_map_addr_cmb(NvmeCtrl *n, QEMUIOVector *iov, hwaddr addr,
@@ -388,6 +400,109 @@ static uint16_t nvme_map(NvmeCtrl *n, QEMUSGList *qsg, QEMUIOVector *iov,
     return nvme_map_prp(n, qsg, iov, prp1, prp2, len, req);
 }
 
+static void nvme_aio_destroy(NvmeAIO *aio)
+{
+    g_free(aio);
+}
+
+/*
+ * Register an asynchronous I/O operation with the NvmeRequest. The NvmeRequest
+ * will not complete until all registered AIO's have completed and the
+ * aio_tailq goes empty.
+ */
+static inline void nvme_req_add_aio(NvmeRequest *req, NvmeAIO *aio,
+                                    NvmeAIOOp opc)
+{
+    aio->opc = opc;
+
+    trace_nvme_dev_req_add_aio(nvme_cid(req), aio, blk_name(aio->blk),
+                               aio->offset, aio->len,
+                               nvme_aio_opc_str(aio), req);
+
+    if (req) {
+        QTAILQ_INSERT_TAIL(&req->aio_tailq, aio, tailq_entry);
+    }
+}
+
+/*
+ * Submit an asynchronous I/O operation as described by the given NvmeAIO. This
+ * function takes care of accounting and special handling of reads and writes
+ * going to the Controller Memory Buffer.
+ */
+static void nvme_submit_aio(NvmeAIO *aio)
+{
+    BlockBackend *blk = aio->blk;
+    BlockAcctCookie *acct = &aio->acct;
+    BlockAcctStats *stats = blk_get_stats(blk);
+
+    bool is_write;
+
+    switch (aio->opc) {
+    case NVME_AIO_OPC_NONE:
+        break;
+
+    case NVME_AIO_OPC_FLUSH:
+        block_acct_start(stats, acct, 0, BLOCK_ACCT_FLUSH);
+        aio->aiocb = blk_aio_flush(blk, nvme_aio_cb, aio);
+        break;
+
+    case NVME_AIO_OPC_WRITE_ZEROES:
+        block_acct_start(stats, acct, aio->len, BLOCK_ACCT_WRITE);
+        aio->aiocb = blk_aio_pwrite_zeroes(blk, aio->offset, aio->len,
+                                           BDRV_REQ_MAY_UNMAP, nvme_aio_cb,
+                                           aio);
+        break;
+
+    case NVME_AIO_OPC_READ:
+    case NVME_AIO_OPC_WRITE:
+        is_write = (aio->opc == NVME_AIO_OPC_WRITE);
+
+        block_acct_start(stats, acct, aio->len,
+                         is_write ? BLOCK_ACCT_WRITE : BLOCK_ACCT_READ);
+
+        if (aio->qsg) {
+            if (is_write) {
+                aio->aiocb = dma_blk_write(blk, aio->qsg, aio->offset,
+                                           BDRV_SECTOR_SIZE, nvme_aio_cb, aio);
+            } else {
+                aio->aiocb = dma_blk_read(blk, aio->qsg, aio->offset,
+                                          BDRV_SECTOR_SIZE, nvme_aio_cb, aio);
+            }
+        } else {
+            if (is_write) {
+                aio->aiocb = blk_aio_pwritev(blk, aio->offset, aio->iov, 0,
+                                             nvme_aio_cb, aio);
+            } else {
+                aio->aiocb = blk_aio_preadv(blk, aio->offset, aio->iov, 0,
+                                            nvme_aio_cb, aio);
+            }
+        }
+
+        break;
+    }
+}
+
+static void nvme_rw_aio(BlockBackend *blk, uint64_t offset, NvmeRequest *req)
+{
+    NvmeAIO *aio;
+    size_t len = req->qsg.nsg > 0 ? req->qsg.size : req->iov.size;
+
+    aio = g_new0(NvmeAIO, 1);
+
+    *aio = (NvmeAIO) {
+        .blk = blk,
+        .offset = offset,
+        .len = len,
+        .req = req,
+        .qsg = req->qsg.sg ? &req->qsg : NULL,
+        .iov = req->iov.iov ? &req->iov : NULL,
+    };
+
+    nvme_req_add_aio(req, aio, nvme_req_is_write(req) ?
+                     NVME_AIO_OPC_WRITE : NVME_AIO_OPC_READ);
+    nvme_submit_aio(aio);
+}
+
 static void nvme_post_cqes(void *opaque)
 {
     NvmeCQueue *cq = opaque;
@@ -424,6 +539,7 @@ static void nvme_enqueue_req_completion(NvmeCQueue *cq, NvmeRequest *req)
     assert(cq->cqid == req->sq->cqid);
     trace_nvme_dev_enqueue_req_completion(nvme_cid(req), cq->cqid,
                                           req->status);
+
     QTAILQ_REMOVE(&req->sq->out_req_list, req, entry);
     QTAILQ_INSERT_TAIL(&cq->req_list, req, entry);
     timer_mod(cq->timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + 500);
@@ -535,38 +651,104 @@ static inline uint16_t nvme_check_bounds(NvmeCtrl *n, uint64_t slba,
     return NVME_SUCCESS;
 }
 
-static void nvme_rw_cb(void *opaque, int ret)
+static void nvme_rw_cb(NvmeRequest *req, void *opaque)
 {
-    NvmeRequest *req = opaque;
     NvmeSQueue *sq = req->sq;
     NvmeCtrl *n = sq->ctrl;
     NvmeCQueue *cq = n->cq[sq->cqid];
 
     trace_nvme_dev_rw_cb(nvme_cid(req));
 
-    if (!ret) {
-        block_acct_done(blk_get_stats(n->conf.blk), &req->acct);
-        req->status = NVME_SUCCESS;
-    } else {
-        block_acct_failed(blk_get_stats(n->conf.blk), &req->acct);
-        req->status = NVME_INTERNAL_DEV_ERROR;
-    }
-
-    if (req->qsg.nalloc) {
-        qemu_sglist_destroy(&req->qsg);
-    }
-    if (req->iov.nalloc) {
-        qemu_iovec_destroy(&req->iov);
-    }
-
     nvme_enqueue_req_completion(cq, req);
 }
 
+static void nvme_aio_cb(void *opaque, int ret)
+{
+    NvmeAIO *aio = opaque;
+    NvmeRequest *req = aio->req;
+
+    BlockBackend *blk = aio->blk;
+    BlockAcctCookie *acct = &aio->acct;
+    BlockAcctStats *stats = blk_get_stats(blk);
+
+    Error *local_err = NULL;
+
+    trace_nvme_dev_aio_cb(nvme_cid(req), aio, blk_name(blk), aio->offset,
+                          nvme_aio_opc_str(aio), req);
+
+    if (req) {
+        QTAILQ_REMOVE(&req->aio_tailq, aio, tailq_entry);
+    }
+
+    if (!ret) {
+        block_acct_done(stats, acct);
+    } else {
+        block_acct_failed(stats, acct);
+
+        if (req) {
+            uint16_t status;
+
+            switch (aio->opc) {
+            case NVME_AIO_OPC_READ:
+                status = NVME_UNRECOVERED_READ;
+                break;
+            case NVME_AIO_OPC_WRITE:
+            case NVME_AIO_OPC_WRITE_ZEROES:
+                status = NVME_WRITE_FAULT;
+                break;
+            default:
+                status = NVME_INTERNAL_DEV_ERROR;
+                break;
+            }
+
+            trace_nvme_dev_err_aio(nvme_cid(req), aio, blk_name(blk),
+                                   aio->offset, nvme_aio_opc_str(aio), req,
+                                   status);
+
+            error_setg_errno(&local_err, -ret, "aio failed");
+            error_report_err(local_err);
+
+            /*
+             * An Internal Error trumps all other errors. For other errors,
+             * only set the first error encountered. Any additional errors will
+             * be recorded in the error information log page.
+             */
+            if (!req->status || (status & 0xfff) == NVME_INTERNAL_DEV_ERROR) {
+                req->status = status;
+            }
+        }
+    }
+
+    if (aio->cb) {
+        aio->cb(aio, aio->cb_arg, ret);
+    }
+
+    if (req && QTAILQ_EMPTY(&req->aio_tailq)) {
+        if (req->cb) {
+            req->cb(req, req->cb_arg);
+        } else {
+            NvmeSQueue *sq = req->sq;
+            NvmeCtrl *n = sq->ctrl;
+            NvmeCQueue *cq = n->cq[sq->cqid];
+
+            nvme_enqueue_req_completion(cq, req);
+        }
+    }
+
+    nvme_aio_destroy(aio);
+}
+
 static uint16_t nvme_flush(NvmeCtrl *n, NvmeRequest *req)
 {
-    block_acct_start(blk_get_stats(n->conf.blk), &req->acct, 0,
-                     BLOCK_ACCT_FLUSH);
-    req->aiocb = blk_aio_flush(n->conf.blk, nvme_rw_cb, req);
+    NvmeAIO *aio = g_new0(NvmeAIO, 1);
+
+    *aio = (NvmeAIO) {
+        .blk = n->conf.blk,
+        .req = req,
+    };
+
+    nvme_req_add_aio(req, aio, NVME_AIO_OPC_FLUSH);
+    nvme_submit_aio(aio);
 
     return NVME_NO_COMPLETE;
 }
@@ -575,25 +757,38 @@ static uint16_t nvme_write_zeroes(NvmeCtrl *n, NvmeRequest *req)
 {
     NvmeRwCmd *rw = (NvmeRwCmd *)&req->cmd;
     NvmeNamespace *ns = req->ns;
-    const uint8_t lba_index = NVME_ID_NS_FLBAS_INDEX(ns->id_ns.flbas);
-    const uint8_t data_shift = ns->id_ns.lbaf[lba_index].ds;
-    uint64_t slba = le64_to_cpu(rw->slba);
-    uint32_t nlb  = le16_to_cpu(rw->nlb) + 1;
-    uint64_t offset = slba << data_shift;
-    uint32_t count = nlb << data_shift;
+    NvmeAIO *aio;
+
+    int64_t offset;
+    size_t count;
     uint16_t status;
 
-    trace_nvme_dev_write_zeroes(nvme_cid(req), slba, nlb);
 
-    status = nvme_check_bounds(n, slba, nlb, req);
+    req->slba = le64_to_cpu(rw->slba);
+    req->nlb = le16_to_cpu(rw->nlb) + 1;
+
+    trace_nvme_dev_write_zeroes(nvme_cid(req), req->slba, req->nlb);
+
+    status = nvme_check_bounds(n, req->slba, req->nlb, req);
     if (status) {
         return status;
    }
 
-    block_acct_start(blk_get_stats(n->conf.blk), &req->acct, 0,
-                     BLOCK_ACCT_WRITE);
-    req->aiocb = blk_aio_pwrite_zeroes(n->conf.blk, offset, count,
-                                       BDRV_REQ_MAY_UNMAP, nvme_rw_cb, req);
+    offset = req->slba << nvme_ns_lbads(ns);
+    count = req->nlb << nvme_ns_lbads(ns);
+
+    aio = g_new0(NvmeAIO, 1);
+
+    *aio = (NvmeAIO) {
+        .blk = n->conf.blk,
+        .offset = offset,
+        .len = count,
+        .req = req,
+    };
+
+    nvme_req_add_aio(req, aio, NVME_AIO_OPC_WRITE_ZEROES);
+    nvme_submit_aio(aio);
+
     return NVME_NO_COMPLETE;
 }
 
@@ -601,55 +796,43 @@ static uint16_t nvme_rw(NvmeCtrl *n, NvmeRequest *req)
 {
     NvmeRwCmd *rw = (NvmeRwCmd *)&req->cmd;
     NvmeNamespace *ns = req->ns;
-    uint32_t nlb  = le32_to_cpu(rw->nlb) + 1;
-    uint64_t slba = le64_to_cpu(rw->slba);
+    uint32_t len;
+    int status;
 
-    uint8_t lba_index  = NVME_ID_NS_FLBAS_INDEX(ns->id_ns.flbas);
-    uint8_t data_shift = ns->id_ns.lbaf[lba_index].ds;
-    uint64_t data_size = (uint64_t)nlb << data_shift;
-    uint64_t data_offset = slba << data_shift;
-    int is_write = rw->opcode == NVME_CMD_WRITE ? 1 : 0;
-    enum BlockAcctType acct = is_write ? BLOCK_ACCT_WRITE : BLOCK_ACCT_READ;
-    uint16_t status;
+    enum BlockAcctType acct =
+        nvme_req_is_write(req) ? BLOCK_ACCT_WRITE : BLOCK_ACCT_READ;
 
-    trace_nvme_dev_rw(is_write ? "write" : "read", nlb, data_size, slba);
+    req->nlb = le16_to_cpu(rw->nlb) + 1;
+    req->slba = le64_to_cpu(rw->slba);
 
-    status = nvme_check_mdts(n, data_size, req);
+    len = req->nlb << nvme_ns_lbads(ns);
+
+    trace_nvme_dev_rw(nvme_req_is_write(req) ? "write" : "read", req->nlb,
"write" : "read", req->nlb, + req->nlb << nvme_ns_lbads(req->ns), req->slba); + + status =3D nvme_check_mdts(n, len, req); if (status) { - block_acct_invalid(blk_get_stats(n->conf.blk), acct); - return status; + goto invalid; } =20 - status =3D nvme_check_bounds(n, slba, nlb, req); + status =3D nvme_check_bounds(n, req->slba, req->nlb, req); if (status) { - block_acct_invalid(blk_get_stats(n->conf.blk), acct); - return status; + goto invalid; } =20 - if (nvme_map(n, &req->qsg, &req->iov, data_size, req)) { - block_acct_invalid(blk_get_stats(n->conf.blk), acct); - return NVME_INVALID_FIELD | NVME_DNR; + status =3D nvme_map(n, &req->qsg, &req->iov, len, req); + if (status) { + goto invalid; } =20 - if (req->qsg.nsg > 0) { - block_acct_start(blk_get_stats(n->conf.blk), &req->acct, req->qsg.= size, - acct); - req->aiocb =3D is_write ? - dma_blk_write(n->conf.blk, &req->qsg, data_offset, BDRV_SECTOR= _SIZE, - nvme_rw_cb, req) : - dma_blk_read(n->conf.blk, &req->qsg, data_offset, BDRV_SECTOR_= SIZE, - nvme_rw_cb, req); - } else { - block_acct_start(blk_get_stats(n->conf.blk), &req->acct, req->iov.= size, - acct); - req->aiocb =3D is_write ? - blk_aio_pwritev(n->conf.blk, data_offset, &req->iov, 0, nvme_r= w_cb, - req) : - blk_aio_preadv(n->conf.blk, data_offset, &req->iov, 0, nvme_rw= _cb, - req); - } + nvme_rw_aio(n->conf.blk, req->slba << nvme_ns_lbads(ns), req); + nvme_req_set_cb(req, nvme_rw_cb, NULL); =20 return NVME_NO_COMPLETE; + +invalid: + block_acct_invalid(blk_get_stats(n->conf.blk), acct); + return status; } =20 static uint16_t nvme_io_cmd(NvmeCtrl *n, NvmeRequest *req) @@ -696,6 +879,7 @@ static uint16_t nvme_del_sq(NvmeCtrl *n, NvmeRequest *r= eq) NvmeRequest *r, *next; NvmeSQueue *sq; NvmeCQueue *cq; + NvmeAIO *aio; uint16_t qid =3D le16_to_cpu(c->qid); =20 if (unlikely(!qid || nvme_check_sqid(n, qid))) { @@ -708,8 +892,11 @@ static uint16_t nvme_del_sq(NvmeCtrl *n, NvmeRequest *= req) sq =3D n->sq[qid]; while (!QTAILQ_EMPTY(&sq->out_req_list)) { r =3D QTAILQ_FIRST(&sq->out_req_list); - assert(r->aiocb); - blk_aio_cancel(r->aiocb); + while (!QTAILQ_EMPTY(&r->aio_tailq)) { + aio =3D QTAILQ_FIRST(&r->aio_tailq); + assert(aio->aiocb); + blk_aio_cancel(aio->aiocb); + } } if (!nvme_check_cqid(n, sq->cqid)) { cq =3D n->cq[sq->cqid]; @@ -746,6 +933,7 @@ static void nvme_init_sq(NvmeSQueue *sq, NvmeCtrl *n, u= int64_t dma_addr, QTAILQ_INIT(&sq->out_req_list); for (i =3D 0; i < sq->size; i++) { sq->io_req[i].sq =3D sq; + QTAILQ_INIT(&(sq->io_req[i].aio_tailq)); QTAILQ_INSERT_TAIL(&(sq->req_list), &sq->io_req[i], entry); } sq->timer =3D timer_new_ns(QEMU_CLOCK_VIRTUAL, nvme_process_sq, sq); diff --git a/hw/block/nvme.h b/hw/block/nvme.h index a781528a0ccd..ed4a5ce4121d 100644 --- a/hw/block/nvme.h +++ b/hw/block/nvme.h @@ -27,18 +27,41 @@ typedef struct NvmeAsyncEvent { NvmeAerResult result; } NvmeAsyncEvent; =20 -typedef struct NvmeRequest { - struct NvmeSQueue *sq; - struct NvmeNamespace *ns; - BlockAIOCB *aiocb; - uint16_t status; - NvmeCqe cqe; - NvmeCmd cmd; - BlockAcctCookie acct; - QEMUSGList qsg; - QEMUIOVector iov; - QTAILQ_ENTRY(NvmeRequest)entry; -} NvmeRequest; +typedef struct NvmeRequest NvmeRequest; +typedef void NvmeRequestCompletionFunc(NvmeRequest *req, void *opaque); + +struct NvmeRequest { + struct NvmeSQueue *sq; + struct NvmeNamespace *ns; + + NvmeCqe cqe; + NvmeCmd cmd; + uint16_t status; + + uint64_t slba; + uint32_t nlb; + + QEMUSGList qsg; + QEMUIOVector iov; + + NvmeRequestCompletionFunc *cb; + void *cb_arg; + + QTAILQ_HEAD(, NvmeAIO) aio_tailq; + 
+    QTAILQ_ENTRY(NvmeRequest) entry;
+};
+
+static inline void nvme_req_set_cb(NvmeRequest *req,
+                                   NvmeRequestCompletionFunc *cb, void *cb_arg)
+{
+    req->cb = cb;
+    req->cb_arg = cb_arg;
+}
+
+static inline void nvme_req_clear_cb(NvmeRequest *req)
+{
+    req->cb = req->cb_arg = NULL;
+}
 
 typedef struct NvmeSQueue {
     struct NvmeCtrl *ctrl;
@@ -90,6 +113,60 @@ static inline size_t nvme_ns_lbads_bytes(NvmeNamespace *ns)
     return 1 << nvme_ns_lbads(ns);
 }
 
+typedef enum NvmeAIOOp {
+    NVME_AIO_OPC_NONE         = 0x0,
+    NVME_AIO_OPC_FLUSH        = 0x1,
+    NVME_AIO_OPC_READ         = 0x2,
+    NVME_AIO_OPC_WRITE        = 0x3,
+    NVME_AIO_OPC_WRITE_ZEROES = 0x4,
+} NvmeAIOOp;
+
+typedef struct NvmeAIO NvmeAIO;
+typedef void NvmeAIOCompletionFunc(NvmeAIO *aio, void *opaque, int ret);
+
+struct NvmeAIO {
+    NvmeRequest *req;
+
+    NvmeAIOOp       opc;
+    int64_t         offset;
+    size_t          len;
+    BlockBackend    *blk;
+    BlockAIOCB      *aiocb;
+    BlockAcctCookie acct;
+
+    NvmeAIOCompletionFunc *cb;
+    void                  *cb_arg;
+
+    QEMUSGList   *qsg;
+    QEMUIOVector *iov;
+
+    QTAILQ_ENTRY(NvmeAIO) tailq_entry;
+};
+
+static inline const char *nvme_aio_opc_str(NvmeAIO *aio)
+{
+    switch (aio->opc) {
+    case NVME_AIO_OPC_NONE:         return "NVME_AIO_OP_NONE";
+    case NVME_AIO_OPC_FLUSH:        return "NVME_AIO_OP_FLUSH";
+    case NVME_AIO_OPC_READ:         return "NVME_AIO_OP_READ";
+    case NVME_AIO_OPC_WRITE:        return "NVME_AIO_OP_WRITE";
+    case NVME_AIO_OPC_WRITE_ZEROES: return "NVME_AIO_OP_WRITE_ZEROES";
+    default:                        return "NVME_AIO_OP_UNKNOWN";
+    }
+}
+
+static inline bool nvme_req_is_write(NvmeRequest *req)
+{
+    switch (req->cmd.opcode) {
+    case NVME_CMD_WRITE:
+    case NVME_CMD_WRITE_UNCOR:
+    case NVME_CMD_WRITE_ZEROES:
+        return true;
+    default:
+        return false;
+    }
+}
+
 #define TYPE_NVME "nvme"
 #define NVME(obj) \
     OBJECT_CHECK(NvmeCtrl, (obj), TYPE_NVME)
diff --git a/hw/block/trace-events b/hw/block/trace-events
index 291422a5b77d..7c277a2999c0 100644
--- a/hw/block/trace-events
+++ b/hw/block/trace-events
@@ -34,6 +34,8 @@ nvme_dev_irq_pin(void) "pulsing IRQ pin"
 nvme_dev_irq_masked(void) "IRQ is masked"
 nvme_dev_dma_read(uint64_t prp1, uint64_t prp2) "DMA read, prp1=0x%"PRIx64" prp2=0x%"PRIx64""
 nvme_dev_map_prp(uint16_t cid, uint64_t trans_len, uint32_t len, uint64_t prp1, uint64_t prp2, int num_prps) "cid %"PRIu16" trans_len %"PRIu64" len %"PRIu32" prp1 0x%"PRIx64" prp2 0x%"PRIx64" num_prps %d"
+nvme_dev_req_add_aio(uint16_t cid, void *aio, const char *blkname, uint64_t offset, uint64_t count, const char *opc, void *req) "cid %"PRIu16" aio %p blk \"%s\" offset %"PRIu64" count %"PRIu64" opc \"%s\" req %p"
+nvme_dev_aio_cb(uint16_t cid, void *aio, const char *blkname, uint64_t offset, const char *opc, void *req) "cid %"PRIu16" aio %p blk \"%s\" offset %"PRIu64" opc \"%s\" req %p"
 nvme_dev_io_cmd(uint16_t cid, uint32_t nsid, uint16_t sqid, uint8_t opcode) "cid %"PRIu16" nsid %"PRIu32" sqid %"PRIu16" opc 0x%"PRIx8""
 nvme_dev_admin_cmd(uint16_t cid, uint16_t sqid, uint8_t opcode) "cid %"PRIu16" sqid %"PRIu16" opc 0x%"PRIx8""
 nvme_dev_rw(const char *verb, uint32_t blk_count, uint64_t byte_count, uint64_t lba) "%s %"PRIu32" blocks (%"PRIu64" bytes) from LBA %"PRIu64""
@@ -84,6 +86,7 @@ nvme_dev_mmio_doorbell_sq(uint16_t sqid, uint16_t new_tail) "cqid %"PRIu16" new_
 
 # nvme traces for error conditions
 nvme_dev_err_mdts(uint16_t cid, size_t mdts, size_t len) "cid %"PRIu16" mdts %"PRIu64" len %"PRIu64""
+nvme_dev_err_aio(uint16_t cid, void *aio, const char *blkname, uint64_t offset, const char *opc, void *req, uint16_t status) "cid %"PRIu16" aio %p blk \"%s\" offset %"PRIu64" opc \"%s\" req %p status 0x%"PRIx16""
offset %"PRIu64" opc \"%s\" req %p status 0x%"PRIx16"" nvme_dev_err_invalid_dma(void) "PRP/SGL is too small for transfer size" nvme_dev_err_invalid_prplist_ent(uint64_t prplist) "PRP list entry is null= or not page aligned: 0x%"PRIx64"" nvme_dev_err_invalid_prp2_align(uint64_t prp2) "PRP2 is not page aligned: = 0x%"PRIx64"" --=20 2.26.0