From: Klaus Jensen
To: qemu-devel@nongnu.org, Peter Maydell
Cc: Fam Zheng, Kevin Wolf, qemu-block@nongnu.org, Niklas Cassel, Dmitry Fomichev, Klaus Jensen, Max Reitz, Hans Holmberg, Stefan Hajnoczi, Keith Busch
Subject: [PULL 17/56] hw/block/nvme: Introduce max active and open zone limits
Date: Tue, 9 Feb 2021 08:30:22 +0100
Message-Id: <20210209073101.548811-18-its@irrelevant.dk>
In-Reply-To: <20210209073101.548811-1-its@irrelevant.dk>
References: <20210209073101.548811-1-its@irrelevant.dk>

From: Dmitry Fomichev

Add two module properties, "zoned.max_active" and "zoned.max_open" to
control the maximum
number of zones that can be active or open. Once these variables are
set to non-default values, these limits are checked during I/O and
Too Many Active or Too Many Open command status is returned if they
are exceeded.

Signed-off-by: Hans Holmberg
Signed-off-by: Dmitry Fomichev
Reviewed-by: Niklas Cassel
Reviewed-by: Keith Busch
Signed-off-by: Klaus Jensen
---
 hw/block/nvme-ns.h    | 41 +++++++++++++++++++
 hw/block/nvme-ns.c    | 31 ++++++++++++++-
 hw/block/nvme.c       | 92 +++++++++++++++++++++++++++++++++++++++++++
 hw/block/trace-events |  2 +
 4 files changed, 164 insertions(+), 2 deletions(-)

diff --git a/hw/block/nvme-ns.h b/hw/block/nvme-ns.h
index 388381dda0df..7e1fd26909ba 100644
--- a/hw/block/nvme-ns.h
+++ b/hw/block/nvme-ns.h
@@ -33,6 +33,8 @@ typedef struct NvmeNamespaceParams {
     bool cross_zone_read;
     uint64_t zone_size_bs;
     uint64_t zone_cap_bs;
+    uint32_t max_active_zones;
+    uint32_t max_open_zones;
 } NvmeNamespaceParams;
 
 typedef struct NvmeNamespace {
@@ -54,6 +56,8 @@ typedef struct NvmeNamespace {
     uint64_t zone_size;
     uint64_t zone_capacity;
     uint32_t zone_size_log2;
+    int32_t nr_open_zones;
+    int32_t nr_active_zones;
 
     NvmeNamespaceParams params;
 
@@ -125,6 +129,43 @@ static inline bool nvme_wp_is_valid(NvmeZone *zone)
            st != NVME_ZONE_STATE_OFFLINE;
 }
 
+static inline void nvme_aor_inc_open(NvmeNamespace *ns)
+{
+    assert(ns->nr_open_zones >= 0);
+    if (ns->params.max_open_zones) {
+        ns->nr_open_zones++;
+        assert(ns->nr_open_zones <= ns->params.max_open_zones);
+    }
+}
+
+static inline void nvme_aor_dec_open(NvmeNamespace *ns)
+{
+    if (ns->params.max_open_zones) {
+        assert(ns->nr_open_zones > 0);
+        ns->nr_open_zones--;
+    }
+    assert(ns->nr_open_zones >= 0);
+}
+
+static inline void nvme_aor_inc_active(NvmeNamespace *ns)
+{
+    assert(ns->nr_active_zones >= 0);
+    if (ns->params.max_active_zones) {
+        ns->nr_active_zones++;
+        assert(ns->nr_active_zones <= ns->params.max_active_zones);
+    }
+}
+
+static inline void nvme_aor_dec_active(NvmeNamespace *ns)
+{
+    if (ns->params.max_active_zones) {
+        assert(ns->nr_active_zones > 0);
+        ns->nr_active_zones--;
+        assert(ns->nr_active_zones >= ns->nr_open_zones);
+    }
+    assert(ns->nr_active_zones >= 0);
+}
+
 int nvme_ns_setup(NvmeCtrl *n, NvmeNamespace *ns, Error **errp);
 void nvme_ns_drain(NvmeNamespace *ns);
 void nvme_ns_shutdown(NvmeNamespace *ns);
diff --git a/hw/block/nvme-ns.c b/hw/block/nvme-ns.c
index d79452c627cf..c55afc1920a3 100644
--- a/hw/block/nvme-ns.c
+++ b/hw/block/nvme-ns.c
@@ -135,6 +135,21 @@ static int nvme_ns_zoned_check_calc_geometry(NvmeNamespace *ns, Error **errp)
     ns->zone_size = zone_size / lbasz;
     ns->zone_capacity = zone_cap / lbasz;
     ns->num_zones = ns->size / lbasz / ns->zone_size;
+
+    /* Do a few more sanity checks of ZNS properties */
+    if (ns->params.max_open_zones > ns->num_zones) {
+        error_setg(errp,
+                   "max_open_zones value %u exceeds the number of zones %u",
+                   ns->params.max_open_zones, ns->num_zones);
+        return -1;
+    }
+    if (ns->params.max_active_zones > ns->num_zones) {
+        error_setg(errp,
+                   "max_active_zones value %u exceeds the number of zones %u",
+                   ns->params.max_active_zones, ns->num_zones);
+        return -1;
+    }
+
     return 0;
 }
 
@@ -182,8 +197,8 @@ static void nvme_ns_init_zoned(NvmeCtrl *n, NvmeNamespace *ns, int lba_index)
     id_ns_z = g_malloc0(sizeof(NvmeIdNsZoned));
 
     /* MAR/MOR are zeroes-based, 0xffffffff means no limit */
-    id_ns_z->mar = 0xffffffff;
-    id_ns_z->mor = 0xffffffff;
+    id_ns_z->mar = cpu_to_le32(ns->params.max_active_zones - 1);
+    id_ns_z->mor = cpu_to_le32(ns->params.max_open_zones - 1);
     id_ns_z->zoc = 0;
     id_ns_z->ozcs = ns->params.cross_zone_read ? 0x01 : 0x00;
 
@@ -209,6 +224,7 @@ static void nvme_clear_zone(NvmeNamespace *ns, NvmeZone *zone)
             trace_pci_nvme_clear_ns_close(state, zone->d.zslba);
             nvme_set_zone_state(zone, NVME_ZONE_STATE_CLOSED);
         }
+        nvme_aor_inc_active(ns);
         QTAILQ_INSERT_HEAD(&ns->closed_zones, zone, entry);
     } else {
         trace_pci_nvme_clear_ns_reset(state, zone->d.zslba);
@@ -225,16 +241,23 @@ static void nvme_zoned_ns_shutdown(NvmeNamespace *ns)
 
     QTAILQ_FOREACH_SAFE(zone, &ns->closed_zones, entry, next) {
         QTAILQ_REMOVE(&ns->closed_zones, zone, entry);
+        nvme_aor_dec_active(ns);
         nvme_clear_zone(ns, zone);
     }
     QTAILQ_FOREACH_SAFE(zone, &ns->imp_open_zones, entry, next) {
         QTAILQ_REMOVE(&ns->imp_open_zones, zone, entry);
+        nvme_aor_dec_open(ns);
+        nvme_aor_dec_active(ns);
         nvme_clear_zone(ns, zone);
     }
     QTAILQ_FOREACH_SAFE(zone, &ns->exp_open_zones, entry, next) {
         QTAILQ_REMOVE(&ns->exp_open_zones, zone, entry);
+        nvme_aor_dec_open(ns);
+        nvme_aor_dec_active(ns);
         nvme_clear_zone(ns, zone);
     }
+
+    assert(ns->nr_open_zones == 0);
 }
 
 static int nvme_ns_check_constraints(NvmeNamespace *ns, Error **errp)
@@ -320,6 +343,10 @@ static Property nvme_ns_props[] = {
                      0),
     DEFINE_PROP_BOOL("zoned.cross_read", NvmeNamespace,
                      params.cross_zone_read, false),
+    DEFINE_PROP_UINT32("zoned.max_active", NvmeNamespace,
+                       params.max_active_zones, 0),
+    DEFINE_PROP_UINT32("zoned.max_open", NvmeNamespace,
+                       params.max_open_zones, 0),
     DEFINE_PROP_END_OF_LIST(),
 };
 
diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index be83be3fb5d4..c07dbcd2a809 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -206,6 +206,26 @@ static void nvme_assign_zone_state(NvmeNamespace *ns, NvmeZone *zone,
     }
 }
 
+/*
+ * Check if we can open a zone without exceeding open/active limits.
+ * AOR stands for "Active and Open Resources" (see TP 4053 section 2.5).
+ */
+static int nvme_aor_check(NvmeNamespace *ns, uint32_t act, uint32_t opn)
+{
+    if (ns->params.max_active_zones != 0 &&
+        ns->nr_active_zones + act > ns->params.max_active_zones) {
+        trace_pci_nvme_err_insuff_active_res(ns->params.max_active_zones);
+        return NVME_ZONE_TOO_MANY_ACTIVE | NVME_DNR;
+    }
+    if (ns->params.max_open_zones != 0 &&
+        ns->nr_open_zones + opn > ns->params.max_open_zones) {
+        trace_pci_nvme_err_insuff_open_res(ns->params.max_open_zones);
+        return NVME_ZONE_TOO_MANY_OPEN | NVME_DNR;
+    }
+
+    return NVME_SUCCESS;
+}
+
 static bool nvme_addr_is_cmb(NvmeCtrl *n, hwaddr addr)
 {
     hwaddr low = n->ctrl_mem.addr;
@@ -1168,6 +1188,40 @@ static uint16_t nvme_check_zone_read(NvmeNamespace *ns, uint64_t slba,
     return status;
 }
 
+static void nvme_auto_transition_zone(NvmeNamespace *ns)
+{
+    NvmeZone *zone;
+
+    if (ns->params.max_open_zones &&
+        ns->nr_open_zones == ns->params.max_open_zones) {
+        zone = QTAILQ_FIRST(&ns->imp_open_zones);
+        if (zone) {
+            /*
+             * Automatically close this implicitly open zone.
+             */
+            QTAILQ_REMOVE(&ns->imp_open_zones, zone, entry);
+            nvme_aor_dec_open(ns);
+            nvme_assign_zone_state(ns, zone, NVME_ZONE_STATE_CLOSED);
+        }
+    }
+}
+
+static uint16_t nvme_auto_open_zone(NvmeNamespace *ns, NvmeZone *zone)
+{
+    uint16_t status = NVME_SUCCESS;
+    uint8_t zs = nvme_get_zone_state(zone);
+
+    if (zs == NVME_ZONE_STATE_EMPTY) {
+        nvme_auto_transition_zone(ns);
+        status = nvme_aor_check(ns, 1, 1);
+    } else if (zs == NVME_ZONE_STATE_CLOSED) {
+        nvme_auto_transition_zone(ns);
+        status = nvme_aor_check(ns, 0, 1);
+    }
+
+    return status;
+}
+
 static void nvme_finalize_zoned_write(NvmeNamespace *ns, NvmeRequest *req,
                                       bool failed)
 {
@@ -1188,7 +1242,11 @@ static void nvme_finalize_zoned_write(NvmeNamespace *ns, NvmeRequest *req,
         switch (nvme_get_zone_state(zone)) {
         case NVME_ZONE_STATE_IMPLICITLY_OPEN:
         case NVME_ZONE_STATE_EXPLICITLY_OPEN:
+            nvme_aor_dec_open(ns);
+            /* fall through */
         case NVME_ZONE_STATE_CLOSED:
+            nvme_aor_dec_active(ns);
+            /* fall through */
         case NVME_ZONE_STATE_EMPTY:
             nvme_assign_zone_state(ns, zone, NVME_ZONE_STATE_FULL);
             /* fall through */
@@ -1215,7 +1273,10 @@ static uint64_t nvme_advance_zone_wp(NvmeNamespace *ns, NvmeZone *zone,
         zs = nvme_get_zone_state(zone);
         switch (zs) {
         case NVME_ZONE_STATE_EMPTY:
+            nvme_aor_inc_active(ns);
+            /* fall through */
         case NVME_ZONE_STATE_CLOSED:
+            nvme_aor_inc_open(ns);
             nvme_assign_zone_state(ns, zone, NVME_ZONE_STATE_IMPLICITLY_OPEN);
         }
     }
@@ -1556,6 +1617,11 @@ static uint16_t nvme_do_write(NvmeCtrl *n, NvmeRequest *req, bool append,
             goto invalid;
         }
 
+        status = nvme_auto_open_zone(ns, zone);
+        if (status != NVME_SUCCESS) {
+            goto invalid;
+        }
+
         if (append) {
             slba = zone->w_ptr;
         }
@@ -1651,9 +1717,26 @@ enum NvmeZoneProcessingMask {
 static uint16_t nvme_open_zone(NvmeNamespace *ns, NvmeZone *zone,
                                enum NvmeZoneState state)
 {
+    uint16_t status;
+
     switch (state) {
     case NVME_ZONE_STATE_EMPTY:
+        status = nvme_aor_check(ns, 1, 0);
+        if (status != NVME_SUCCESS) {
+            return status;
+        }
+        nvme_aor_inc_active(ns);
+        /* fall through */
     case NVME_ZONE_STATE_CLOSED:
+        status = nvme_aor_check(ns, 0, 1);
+        if (status != NVME_SUCCESS) {
+            if (state == NVME_ZONE_STATE_EMPTY) {
+                nvme_aor_dec_active(ns);
+            }
+            return status;
+        }
+        nvme_aor_inc_open(ns);
+        /* fall through */
     case NVME_ZONE_STATE_IMPLICITLY_OPEN:
         nvme_assign_zone_state(ns, zone, NVME_ZONE_STATE_EXPLICITLY_OPEN);
         /* fall through */
@@ -1670,6 +1753,7 @@ static uint16_t nvme_close_zone(NvmeNamespace *ns, NvmeZone *zone,
     switch (state) {
     case NVME_ZONE_STATE_EXPLICITLY_OPEN:
     case NVME_ZONE_STATE_IMPLICITLY_OPEN:
+        nvme_aor_dec_open(ns);
         nvme_assign_zone_state(ns, zone, NVME_ZONE_STATE_CLOSED);
         /* fall through */
     case NVME_ZONE_STATE_CLOSED:
@@ -1685,7 +1769,11 @@ static uint16_t nvme_finish_zone(NvmeNamespace *ns, NvmeZone *zone,
     switch (state) {
     case NVME_ZONE_STATE_EXPLICITLY_OPEN:
     case NVME_ZONE_STATE_IMPLICITLY_OPEN:
+        nvme_aor_dec_open(ns);
+        /* fall through */
     case NVME_ZONE_STATE_CLOSED:
+        nvme_aor_dec_active(ns);
+        /* fall through */
     case NVME_ZONE_STATE_EMPTY:
         zone->w_ptr = nvme_zone_wr_boundary(zone);
         zone->d.wp = zone->w_ptr;
@@ -1704,7 +1792,11 @@ static uint16_t nvme_reset_zone(NvmeNamespace *ns, NvmeZone *zone,
     switch (state) {
     case NVME_ZONE_STATE_EXPLICITLY_OPEN:
     case NVME_ZONE_STATE_IMPLICITLY_OPEN:
+        nvme_aor_dec_open(ns);
+        /* fall through */
     case NVME_ZONE_STATE_CLOSED:
+        nvme_aor_dec_active(ns);
+        /* fall through */
     case NVME_ZONE_STATE_FULL:
         zone->w_ptr = zone->d.zslba;
         zone->d.wp = zone->w_ptr;
diff --git a/hw/block/trace-events b/hw/block/trace-events
index 64366d89366c..bbc10bc7371a 100644
--- a/hw/block/trace-events
+++ b/hw/block/trace-events
@@ -128,6 +128,8 @@ pci_nvme_err_append_not_at_start(uint64_t slba, uint64_t zone) "appending at slb
 pci_nvme_err_zone_write_not_ok(uint64_t slba, uint32_t nlb, uint16_t status) "slba=%"PRIu64", nlb=%"PRIu32", status=0x%"PRIx16""
 pci_nvme_err_zone_read_not_ok(uint64_t slba, uint32_t nlb, uint16_t status) "slba=%"PRIu64", nlb=%"PRIu32", status=0x%"PRIx16""
 pci_nvme_err_append_too_large(uint64_t slba, uint32_t nlb, uint8_t zasl) "slba=%"PRIu64", nlb=%"PRIu32", zasl=%"PRIu8""
+pci_nvme_err_insuff_active_res(uint32_t max_active) "max_active=%"PRIu32" zone limit exceeded"
+pci_nvme_err_insuff_open_res(uint32_t max_open) "max_open=%"PRIu32" zone limit exceeded"
 pci_nvme_err_invalid_iocsci(uint32_t idx) "unsupported command set combination index %"PRIu32""
 pci_nvme_err_invalid_del_sq(uint16_t qid) "invalid submission queue deletion, sid=%"PRIu16""
 pci_nvme_err_invalid_create_sq_cqid(uint16_t cqid) "failed creating submission queue, invalid cqid=%"PRIu16""
-- 
2.30.0
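
For readers skimming the patch, the limit accounting that nvme_aor_check()
performs can be modeled outside QEMU. The following is only a minimal
standalone sketch under stated assumptions, not code from this series: the
names aor_state, aor_check, aor_write_empty_zone and aor_close_zone and the
AOR_* status values are hypothetical, the counters are updated even when no
limit is configured, and the automatic closing of an implicitly open zone is
not modeled.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical status values for this sketch; not QEMU's NVME_* codes. */
enum aor_status { AOR_OK = 0, AOR_TOO_MANY_ACTIVE, AOR_TOO_MANY_OPEN };

/* Hypothetical stand-in for the per-namespace limits and counters. */
struct aor_state {
    uint32_t max_active; /* 0 means "no limit", as in the patch */
    uint32_t max_open;   /* 0 means "no limit", as in the patch */
    uint32_t nr_active;
    uint32_t nr_open;
};

/* Would activating `act` and opening `opn` more zones exceed a limit? */
static enum aor_status aor_check(const struct aor_state *s,
                                 uint32_t act, uint32_t opn)
{
    if (s->max_active != 0 && s->nr_active + act > s->max_active) {
        return AOR_TOO_MANY_ACTIVE;
    }
    if (s->max_open != 0 && s->nr_open + opn > s->max_open) {
        return AOR_TOO_MANY_OPEN;
    }
    return AOR_OK;
}

/* Model a write to an empty zone: it becomes both active and open. */
static enum aor_status aor_write_empty_zone(struct aor_state *s)
{
    enum aor_status st = aor_check(s, 1, 1);

    if (st == AOR_OK) {
        s->nr_active++;
        s->nr_open++;
    }
    return st;
}

/* Model closing an open zone: it stays active but frees an open slot. */
static void aor_close_zone(struct aor_state *s)
{
    if (s->nr_open > 0) {
        s->nr_open--;
    }
}
```

With max_active set to 2 and max_open set to 1 in this model, a second write
to an empty zone is refused with the "too many open" status until a zone is
closed, and a third is refused with "too many active" even though an open
slot is free.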