From: Klaus Jensen <its@irrelevant.dk>
To: qemu-devel@nongnu.org
Cc: Kevin Wolf, Fam Zheng, qemu-block@nongnu.org, Gollu Appalanaidu,
    Max Reitz, Stefan Hajnoczi, Keith Busch
Subject: [PATCH RFC 1/3] nvme: add support for extended LBAs
Date: Thu, 17 Dec 2020 22:02:20 +0100
Message-Id: <20201217210222.779619-2-its@irrelevant.dk>
In-Reply-To: <20201217210222.779619-1-its@irrelevant.dk>

From: Gollu Appalanaidu

This allows logical blocks to be extended with a number of metadata bytes
specified by the new namespace parameter 'ms'.
The additional bytes are stored immediately after each logical block. The
Deallocated or Unwritten Logical Block Error recovery feature is not
supported for namespaces with extended LBAs, since the extended logical
blocks are not aligned with the blocks of the underlying device and the
allocation status of blocks thus cannot be determined from the
BDRV_BLOCK_ZERO bdrv_block_status flag. Similarly, the DLFEAT field will
not report any read behavior for deallocated logical blocks.

Signed-off-by: Gollu Appalanaidu
Signed-off-by: Klaus Jensen
---
 hw/block/nvme-ns.h | 19 ++++++++++++++++---
 hw/block/nvme-ns.c | 21 +++++++++++++++++----
 hw/block/nvme.c    |  6 ++++--
 3 files changed, 37 insertions(+), 9 deletions(-)

diff --git a/hw/block/nvme-ns.h b/hw/block/nvme-ns.h
index 44bf6271b744..1e621fb130a3 100644
--- a/hw/block/nvme-ns.h
+++ b/hw/block/nvme-ns.h
@@ -21,6 +21,7 @@

 typedef struct NvmeNamespaceParams {
     uint32_t nsid;
+    uint16_t ms;
 } NvmeNamespaceParams;

 typedef struct NvmeNamespace {
@@ -57,18 +58,30 @@ static inline uint8_t nvme_ns_lbads(NvmeNamespace *ns)
     return nvme_ns_lbaf(ns)->ds;
 }

-/* calculate the number of LBAs that the namespace can accomodate */
-static inline uint64_t nvme_ns_nlbas(NvmeNamespace *ns)
+static inline uint16_t nvme_ns_ms(NvmeNamespace *ns)
 {
-    return ns->size >> nvme_ns_lbads(ns);
+    return nvme_ns_lbaf(ns)->ms;
 }

 /* convert an LBA to the equivalent in bytes */
 static inline size_t nvme_l2b(NvmeNamespace *ns, uint64_t lba)
 {
+    if (NVME_ID_NS_FLBAS_EXTENDED(ns->id_ns.flbas)) {
+        return (lba << nvme_ns_lbads(ns)) + (lba * nvme_ns_ms(ns));
+    }
+
     return lba << nvme_ns_lbads(ns);
 }

+/* calculate the number of LBAs that the namespace can accomodate */
+static inline uint64_t nvme_ns_nlbas(NvmeNamespace *ns)
+{
+    if (NVME_ID_NS_FLBAS_EXTENDED(ns->id_ns.flbas)) {
+        return ns->size / nvme_l2b(ns, 1);
+    }
+
+    return ns->size >> nvme_ns_lbads(ns);
+}
+
 typedef struct NvmeCtrl NvmeCtrl;

 int nvme_ns_setup(NvmeCtrl *n,
                   NvmeNamespace *ns, Error **errp);
diff --git a/hw/block/nvme-ns.c b/hw/block/nvme-ns.c
index 2d69b5177b51..a9785a12eb13 100644
--- a/hw/block/nvme-ns.c
+++ b/hw/block/nvme-ns.c
@@ -37,9 +37,24 @@ static int nvme_ns_init(NvmeNamespace *ns, Error **errp)
     int lba_index = NVME_ID_NS_FLBAS_INDEX(ns->id_ns.flbas);
     int npdg;

-    ns->id_ns.dlfeat = 0x9;
+    id_ns->dlfeat = 0x10;

     id_ns->lbaf[lba_index].ds = 31 - clz32(ns->blkconf.logical_block_size);
+    id_ns->lbaf[lba_index].ms = ns->params.ms;
+
+    /* support DULBE and I/O optimization fields */
+    id_ns->nsfeat |= 0x10;
+
+    if (!ns->params.ms) {
+        /* zeroes are guaranteed to be read from deallocated blocks */
+        id_ns->dlfeat |= 0x1 | 0x8;
+
+        /* support DULBE */
+        id_ns->nsfeat |= 0x4;
+    } else {
+        id_ns->mc = 0x1;
+        id_ns->flbas |= 0x10;
+    }

     id_ns->nsze = cpu_to_le64(nvme_ns_nlbas(ns));

@@ -47,9 +62,6 @@ static int nvme_ns_init(NvmeNamespace *ns, Error **errp)
     id_ns->ncap = id_ns->nsze;
     id_ns->nuse = id_ns->ncap;

-    /* support DULBE and I/O optimization fields */
-    id_ns->nsfeat |= (0x4 | 0x10);
-
     npdg = ns->blkconf.discard_granularity / ns->blkconf.logical_block_size;

     if (bdrv_get_info(blk_bs(ns->blkconf.blk), &bdi) >= 0 &&
@@ -150,6 +162,7 @@ static void nvme_ns_realize(DeviceState *dev, Error **errp)
 static Property nvme_ns_props[] = {
     DEFINE_BLOCK_PROPERTIES(NvmeNamespace, blkconf),
     DEFINE_PROP_UINT32("nsid", NvmeNamespace, params.nsid, 0),
+    DEFINE_PROP_UINT16("ms", NvmeNamespace, params.ms, 0),
     DEFINE_PROP_END_OF_LIST(),
 };

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 28416b18a5c0..e4922c37c94d 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -1214,6 +1214,7 @@ static uint16_t nvme_rw(NvmeCtrl *n, NvmeRequest *req)
         BLOCK_ACCT_WRITE : BLOCK_ACCT_READ;
     BlockBackend *blk = ns->blkconf.blk;
     uint16_t status;
+    uint32_t sector_size;

     trace_pci_nvme_rw(nvme_cid(req), nvme_io_opc_str(rw->opcode),
                       nvme_nsid(ns), nlb, data_size, slba);
@@ -1246,12 +1247,13 @@ static uint16_t nvme_rw(NvmeCtrl *n, NvmeRequest *req)

     block_acct_start(blk_get_stats(blk), &req->acct, data_size, acct);
     if (req->qsg.sg) {
+        sector_size = nvme_l2b(ns, 1);
         if (acct == BLOCK_ACCT_WRITE) {
             req->aiocb = dma_blk_write(blk, &req->qsg, data_offset,
-                                       BDRV_SECTOR_SIZE, nvme_rw_cb, req);
+                                       sector_size, nvme_rw_cb, req);
         } else {
             req->aiocb = dma_blk_read(blk, &req->qsg, data_offset,
-                                      BDRV_SECTOR_SIZE, nvme_rw_cb, req);
+                                      sector_size, nvme_rw_cb, req);
         }
     } else {
         if (acct == BLOCK_ACCT_WRITE) {
-- 
2.29.2
From: Klaus Jensen <its@irrelevant.dk>
To: qemu-devel@nongnu.org
Cc: Kevin Wolf, Fam Zheng, qemu-block@nongnu.org, Max Reitz,
    Stefan Hajnoczi, Keith Busch
Subject: [PATCH RFC 2/3] hw/block/nvme: refactor nvme_dma
Date: Thu, 17 Dec 2020 22:02:21 +0100
Message-Id: <20201217210222.779619-3-its@irrelevant.dk>
In-Reply-To: <20201217210222.779619-1-its@irrelevant.dk>

From: Klaus Jensen

The nvme_dma function doesn't just do DMA (QEMUSGList-based) memory
transfers; it also handles QEMUIOVector copies.

Introduce the NvmeTxDirection enum and rename the function to nvme_tx.
Remove the mapping of PRPs/SGLs from nvme_tx and assert that they have
been mapped previously. This allows more fine-grained use. Also expose
nvme_tx_{qsg,iov} versions in preparation for end-to-end data protection
support.

Add new, better named, helpers, nvme_{c2h,h2c}, that do both the PRP/SGL
mapping and the transfer.

Signed-off-by: Klaus Jensen
---
 hw/block/nvme.c | 133 +++++++++++++++++++++++++++++-------------------
 1 file changed, 80 insertions(+), 53 deletions(-)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index e4922c37c94d..8d580c121bcc 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -682,48 +682,86 @@ static uint16_t nvme_map_dptr(NvmeCtrl *n, size_t len, NvmeRequest *req)
     }
 }

-static uint16_t nvme_dma(NvmeCtrl *n, uint8_t *ptr, uint32_t len,
-                         DMADirection dir, NvmeRequest *req)
+typedef enum NvmeTxDirection {
+    NVME_TX_DIRECTION_TO_DEVICE   = 0,
+    NVME_TX_DIRECTION_FROM_DEVICE = 1,
+} NvmeTxDirection;
+
+static uint16_t nvme_tx_qsg(uint8_t *ptr, uint32_t len, QEMUSGList *qsg,
+                            NvmeTxDirection dir)
 {
-    uint16_t status = NVME_SUCCESS;
+    uint64_t residual;
+
+    if (dir == NVME_TX_DIRECTION_TO_DEVICE) {
+        residual = dma_buf_write(ptr, len, qsg);
+    } else {
+        residual = dma_buf_read(ptr, len, qsg);
+    }
+
+    if (unlikely(residual)) {
+        trace_pci_nvme_err_invalid_dma();
+        return NVME_INVALID_FIELD | NVME_DNR;
+    }
+
+    return NVME_SUCCESS;
+}
+
+static uint16_t nvme_tx_iov(uint8_t *ptr, uint32_t len, QEMUIOVector *iov,
+                            NvmeTxDirection dir)
+{
+    size_t bytes;
+
+    if (dir == NVME_TX_DIRECTION_TO_DEVICE) {
+        bytes = qemu_iovec_to_buf(iov, 0, ptr, len);
+    } else {
+        bytes = qemu_iovec_from_buf(iov, 0, ptr, len);
+    }
+
+    if (unlikely(bytes != len)) {
+        trace_pci_nvme_err_invalid_dma();
+        return NVME_INVALID_FIELD | NVME_DNR;
+    }
+
+    return NVME_SUCCESS;
+}
+
+static uint16_t nvme_tx(NvmeCtrl *n, uint8_t *ptr, uint32_t len,
+                        NvmeTxDirection dir, NvmeRequest *req)
+{
+    /* assert that exactly one of qsg and iov carries data */
+    assert((req->qsg.nsg > 0) != (req->iov.niov > 0));
+
+    if (req->qsg.nsg > 0) {
+        return nvme_tx_qsg(ptr, len, &req->qsg, dir);
+    }
+
+    return nvme_tx_iov(ptr, len, &req->iov, dir);
+}
+
+static inline uint16_t nvme_c2h(NvmeCtrl *n, uint8_t *ptr, uint32_t len,
+                                NvmeRequest *req)
+{
+    uint16_t status;

     status = nvme_map_dptr(n, len, req);
     if (status) {
         return status;
     }

-    /* assert that only one of qsg and iov carries data */
-    assert((req->qsg.nsg > 0) != (req->iov.niov > 0));
+    return nvme_tx(n, ptr, len, NVME_TX_DIRECTION_FROM_DEVICE, req);
+}

-    if (req->qsg.nsg > 0) {
-        uint64_t residual;
+static inline uint16_t nvme_h2c(NvmeCtrl *n, uint8_t *ptr, uint32_t len,
+                                NvmeRequest *req)
+{
+    uint16_t status;

-        if (dir == DMA_DIRECTION_TO_DEVICE) {
-            residual = dma_buf_write(ptr, len, &req->qsg);
-        } else {
-            residual = dma_buf_read(ptr, len, &req->qsg);
-        }
-
-        if (unlikely(residual)) {
-            trace_pci_nvme_err_invalid_dma();
-            status = NVME_INVALID_FIELD | NVME_DNR;
-        }
-    } else {
-        size_t bytes;
-
-        if (dir == DMA_DIRECTION_TO_DEVICE) {
-            bytes = qemu_iovec_to_buf(&req->iov, 0, ptr, len);
-        } else {
-            bytes = qemu_iovec_from_buf(&req->iov, 0, ptr, len);
-        }
-
-        if (unlikely(bytes != len)) {
-            trace_pci_nvme_err_invalid_dma();
-            status = NVME_INVALID_FIELD | NVME_DNR;
-        }
+    status = nvme_map_dptr(n, len, req);
+    if (status) {
+        return status;
     }

-    return status;
+    return nvme_tx(n, ptr, len, NVME_TX_DIRECTION_TO_DEVICE, req);
 }

 static void nvme_post_cqes(void *opaque)
@@ -1025,8 +1063,7 @@ static void nvme_compare_cb(void *opaque, int ret)

     buf = g_malloc(ctx->len);

-    status = nvme_dma(nvme_ctrl(req), buf, ctx->len, DMA_DIRECTION_TO_DEVICE,
-                      req);
+    status = nvme_h2c(nvme_ctrl(req), buf, ctx->len, req);
     if (status) {
         req->status = status;
         goto out;
@@ -1062,8 +1099,7 @@ static uint16_t nvme_dsm(NvmeCtrl *n, NvmeRequest *req)
         NvmeDsmRange range[nr];
         uintptr_t *discards = (uintptr_t *)&req->opaque;

-        status = nvme_dma(n, (uint8_t *)range, sizeof(range),
-                          DMA_DIRECTION_TO_DEVICE, req);
+        status = nvme_h2c(n, (uint8_t *)range, sizeof(range), req);
         if (status) {
             return status;
         }
@@ -1498,8 +1534,7 @@ static uint16_t nvme_smart_info(NvmeCtrl *n, uint8_t rae, uint32_t buf_len,
         nvme_clear_events(n, NVME_AER_TYPE_SMART);
     }

-    return nvme_dma(n, (uint8_t *) &smart + off, trans_len,
-                    DMA_DIRECTION_FROM_DEVICE, req);
+    return nvme_c2h(n, (uint8_t *) &smart + off, trans_len, req);
 }

 static uint16_t nvme_fw_log_info(NvmeCtrl *n, uint32_t buf_len, uint64_t off,
@@ -1517,8 +1552,7 @@ static uint16_t nvme_fw_log_info(NvmeCtrl *n, uint32_t buf_len, uint64_t off,
     strpadcpy((char *)&fw_log.frs1, sizeof(fw_log.frs1), "1.0", ' ');
     trans_len = MIN(sizeof(fw_log) - off, buf_len);

-    return nvme_dma(n, (uint8_t *) &fw_log + off, trans_len,
-                    DMA_DIRECTION_FROM_DEVICE, req);
+    return nvme_c2h(n, (uint8_t *) &fw_log + off, trans_len, req);
 }

 static uint16_t nvme_error_info(NvmeCtrl *n, uint8_t rae, uint32_t buf_len,
@@ -1538,8 +1572,7 @@ static uint16_t nvme_error_info(NvmeCtrl *n, uint8_t rae, uint32_t buf_len,
     memset(&errlog, 0x0, sizeof(errlog));
     trans_len = MIN(sizeof(errlog) - off, buf_len);

-    return nvme_dma(n, (uint8_t *)&errlog, trans_len,
-                    DMA_DIRECTION_FROM_DEVICE, req);
+    return nvme_c2h(n, (uint8_t *)&errlog, trans_len, req);
 }

 static uint16_t nvme_get_log(NvmeCtrl *n, NvmeRequest *req)
@@ -1702,8 +1735,7 @@ static uint16_t nvme_identify_ctrl(NvmeCtrl *n, NvmeRequest *req)
 {
     trace_pci_nvme_identify_ctrl();

-    return nvme_dma(n, (uint8_t *)&n->id_ctrl, sizeof(n->id_ctrl),
-                    DMA_DIRECTION_FROM_DEVICE, req);
+    return nvme_c2h(n, (uint8_t *)&n->id_ctrl, sizeof(n->id_ctrl), req);
 }

 static uint16_t nvme_identify_ns(NvmeCtrl *n, NvmeRequest *req)
@@ -1726,8 +1758,7 @@ static uint16_t nvme_identify_ns(NvmeCtrl *n, NvmeRequest *req)
         id_ns = &ns->id_ns;
     }

-    return nvme_dma(n, (uint8_t *)id_ns, sizeof(NvmeIdNs),
-                    DMA_DIRECTION_FROM_DEVICE, req);
+    return nvme_c2h(n, (uint8_t *)id_ns, sizeof(NvmeIdNs), req);
 }

 static uint16_t nvme_identify_nslist(NvmeCtrl *n, NvmeRequest *req)
@@ -1761,8 +1792,7 @@ static uint16_t nvme_identify_nslist(NvmeCtrl *n, NvmeRequest *req)
             break;
         }
     }
-    ret = nvme_dma(n, (uint8_t *)list, data_len, DMA_DIRECTION_FROM_DEVICE,
-                   req);
+    ret = nvme_c2h(n, (uint8_t *)list, data_len, req);
     g_free(list);
     return ret;
 }
@@ -1804,8 +1834,7 @@ static uint16_t nvme_identify_ns_descr_list(NvmeCtrl *n, NvmeRequest *req)
     ns_descrs->uuid.hdr.nidl = NVME_NIDT_UUID_LEN;
     stl_be_p(&ns_descrs->uuid.v, nsid);

-    return nvme_dma(n, list, NVME_IDENTIFY_DATA_SIZE,
-                    DMA_DIRECTION_FROM_DEVICE, req);
+    return nvme_c2h(n, list, NVME_IDENTIFY_DATA_SIZE, req);
 }

 static uint16_t nvme_identify(NvmeCtrl *n, NvmeRequest *req)
@@ -1878,8 +1907,7 @@ static uint16_t nvme_get_feature_timestamp(NvmeCtrl *n, NvmeRequest *req)
 {
     uint64_t timestamp = nvme_get_timestamp(n);

-    return nvme_dma(n, (uint8_t *)&timestamp, sizeof(timestamp),
-                    DMA_DIRECTION_FROM_DEVICE, req);
+    return nvme_c2h(n, (uint8_t *)&timestamp, sizeof(timestamp), req);
 }

 static uint16_t nvme_get_feature(NvmeCtrl *n, NvmeRequest *req)
@@ -2026,8 +2054,7 @@ static uint16_t nvme_set_feature_timestamp(NvmeCtrl *n, NvmeRequest *req)
     uint16_t ret;
     uint64_t timestamp;

-    ret = nvme_dma(n, (uint8_t *)&timestamp, sizeof(timestamp),
-                   DMA_DIRECTION_TO_DEVICE, req);
+    ret = nvme_h2c(n, (uint8_t *)&timestamp, sizeof(timestamp), req);
     if (ret != NVME_SUCCESS) {
         return ret;
     }
-- 
2.29.2
From: Klaus Jensen <its@irrelevant.dk>
To: qemu-devel@nongnu.org
Cc: Kevin Wolf, Fam Zheng, qemu-block@nongnu.org, Gollu Appalanaidu,
    Max Reitz, Stefan Hajnoczi, Keith Busch
Subject: [PATCH RFC 3/3] hw/block/nvme: end-to-end data protection
Date: Thu, 17 Dec 2020 22:02:22 +0100
Message-Id: <20201217210222.779619-4-its@irrelevant.dk>
In-Reply-To: <20201217210222.779619-1-its@irrelevant.dk>

From: Gollu Appalanaidu

Add support for namespaces formatted with protection information in the
form of the Data Integrity Field (DIF), where the protection information
is contiguous with the logical block data (extended logical blocks). The
type of end-to-end data protection (i.e. Type 1, Type 2 or Type 3) is
selected with the `pi` nvme-ns device parameter. By default, the 8 bytes
of protection information are transferred as the last eight bytes of the
metadata; the `pil` nvme-ns device parameter can be set to 1 to store
them in the first eight bytes instead.

With extended logical blocks, there is no way of reliably determining
that a block is deallocated or unwritten, so this implementation requires
the Application and Reference Tag field values to be initialized to
0xffff and 0xffffffff respectively, indicating that the protection
information shall not be checked. To instruct the device to perform this
initialization, the `pi_init` boolean nvme-ns device parameter is used.

The interleaved memory transfer function and the use of the T10 DIF CRC
lookup table from the Linux kernel are ideas resurrected from Keith's old
dev tree.
Signed-off-by: Gollu Appalanaidu
Signed-off-by: Klaus Jensen
---
 hw/block/nvme-ns.h    |   3 +
 hw/block/nvme.h       |  36 ++++
 include/block/nvme.h  |  24 ++-
 hw/block/nvme-ns.c    |  45 ++++
 hw/block/nvme.c       | 477 +++++++++++++++++++++++++++++++++++++++++-
 hw/block/trace-events |  10 +
 6 files changed, 587 insertions(+), 8 deletions(-)

diff --git a/hw/block/nvme-ns.h b/hw/block/nvme-ns.h
index 1e621fb130a3..5cd39c859472 100644
--- a/hw/block/nvme-ns.h
+++ b/hw/block/nvme-ns.h
@@ -22,6 +22,9 @@
 typedef struct NvmeNamespaceParams {
     uint32_t nsid;
     uint16_t ms;
+    uint8_t  pi;
+    uint8_t  pil;
+    bool     pi_init;
 } NvmeNamespaceParams;

 typedef struct NvmeNamespace {
diff --git a/hw/block/nvme.h b/hw/block/nvme.h
index 574333caa3f9..38f7609207b3 100644
--- a/hw/block/nvme.h
+++ b/hw/block/nvme.h
@@ -182,6 +182,42 @@ static inline NvmeCtrl *nvme_ctrl(NvmeRequest *req)
     return sq->ctrl;
 }

+/* from Linux kernel (crypto/crct10dif_common.c) */
+static const uint16_t t10_dif_crc_table[256] = {
+    0x0000, 0x8BB7, 0x9CD9, 0x176E, 0xB205, 0x39B2, 0x2EDC, 0xA56B,
+    0xEFBD, 0x640A, 0x7364, 0xF8D3, 0x5DB8, 0xD60F, 0xC161, 0x4AD6,
+    0x54CD, 0xDF7A, 0xC814, 0x43A3, 0xE6C8, 0x6D7F, 0x7A11, 0xF1A6,
+    0xBB70, 0x30C7, 0x27A9, 0xAC1E, 0x0975, 0x82C2, 0x95AC, 0x1E1B,
+    0xA99A, 0x222D, 0x3543, 0xBEF4, 0x1B9F, 0x9028, 0x8746, 0x0CF1,
+    0x4627, 0xCD90, 0xDAFE, 0x5149, 0xF422, 0x7F95, 0x68FB, 0xE34C,
+    0xFD57, 0x76E0, 0x618E, 0xEA39, 0x4F52, 0xC4E5, 0xD38B, 0x583C,
+    0x12EA, 0x995D, 0x8E33, 0x0584, 0xA0EF, 0x2B58, 0x3C36, 0xB781,
+    0xD883, 0x5334, 0x445A, 0xCFED, 0x6A86, 0xE131, 0xF65F, 0x7DE8,
+    0x373E, 0xBC89, 0xABE7, 0x2050, 0x853B, 0x0E8C, 0x19E2, 0x9255,
+    0x8C4E, 0x07F9, 0x1097, 0x9B20, 0x3E4B, 0xB5FC, 0xA292, 0x2925,
+    0x63F3, 0xE844, 0xFF2A, 0x749D, 0xD1F6, 0x5A41, 0x4D2F, 0xC698,
+    0x7119, 0xFAAE, 0xEDC0, 0x6677, 0xC31C, 0x48AB, 0x5FC5, 0xD472,
+    0x9EA4, 0x1513, 0x027D, 0x89CA, 0x2CA1, 0xA716, 0xB078, 0x3BCF,
+    0x25D4, 0xAE63, 0xB90D, 0x32BA, 0x97D1, 0x1C66, 0x0B08, 0x80BF,
+    0xCA69, 0x41DE, 0x56B0, 0xDD07, 0x786C, 0xF3DB, 0xE4B5, 0x6F02,
+    0x3AB1, 0xB106, 0xA668, 0x2DDF, 0x88B4, 0x0303, 0x146D, 0x9FDA,
+    0xD50C, 0x5EBB, 0x49D5, 0xC262, 0x6709, 0xECBE, 0xFBD0, 0x7067,
+    0x6E7C, 0xE5CB, 0xF2A5, 0x7912, 0xDC79, 0x57CE, 0x40A0, 0xCB17,
+    0x81C1, 0x0A76, 0x1D18, 0x96AF, 0x33C4, 0xB873, 0xAF1D, 0x24AA,
+    0x932B, 0x189C, 0x0FF2, 0x8445, 0x212E, 0xAA99, 0xBDF7, 0x3640,
+    0x7C96, 0xF721, 0xE04F, 0x6BF8, 0xCE93, 0x4524, 0x524A, 0xD9FD,
+    0xC7E6, 0x4C51, 0x5B3F, 0xD088, 0x75E3, 0xFE54, 0xE93A, 0x628D,
+    0x285B, 0xA3EC, 0xB482, 0x3F35, 0x9A5E, 0x11E9, 0x0687, 0x8D30,
+    0xE232, 0x6985, 0x7EEB, 0xF55C, 0x5037, 0xDB80, 0xCCEE, 0x4759,
+    0x0D8F, 0x8638, 0x9156, 0x1AE1, 0xBF8A, 0x343D, 0x2353, 0xA8E4,
+    0xB6FF, 0x3D48, 0x2A26, 0xA191, 0x04FA, 0x8F4D, 0x9823, 0x1394,
+    0x5942, 0xD2F5, 0xC59B, 0x4E2C, 0xEB47, 0x60F0, 0x779E, 0xFC29,
+    0x4BA8, 0xC01F, 0xD771, 0x5CC6, 0xF9AD, 0x721A, 0x6574, 0xEEC3,
+    0xA415, 0x2FA2, 0x38CC, 0xB37B, 0x1610, 0x9DA7, 0x8AC9, 0x017E,
+    0x1F65, 0x94D2, 0x83BC, 0x080B, 0xAD60, 0x26D7, 0x31B9, 0xBA0E,
+    0xF0D8, 0x7B6F, 0x6C01, 0xE7B6, 0x42DD, 0xC96A, 0xDE04, 0x55B3
+};
+
 int nvme_register_namespace(NvmeCtrl *n, NvmeNamespace *ns, Error **errp);

 #endif /* HW_NVME_H */
diff --git a/include/block/nvme.h b/include/block/nvme.h
index 11ac1c2b7dfb..8888eb041ac0 100644
--- a/include/block/nvme.h
+++ b/include/block/nvme.h
@@ -583,8 +583,11 @@ enum {
     NVME_RW_PRINFO_PRCHK_GUARD  = 1 << 12,
     NVME_RW_PRINFO_PRCHK_APP    = 1 << 11,
     NVME_RW_PRINFO_PRCHK_REF    = 1 << 10,
+    NVME_RW_PRINFO_PRCHK_MASK   = 7 << 10,
 };

+#define NVME_RW_PRINFO(control) ((control >> 10) & 0xf)
+
 typedef struct QEMU_PACKED NvmeDsmCmd {
     uint8_t opcode;
     uint8_t flags;
@@ -1051,14 +1054,22 @@ enum NvmeNsIdentifierType {
 #define NVME_ID_NS_DPC_TYPE_MASK 0x7

 enum NvmeIdNsDps {
-    DPS_TYPE_NONE   = 0,
-    DPS_TYPE_1      = 1,
-    DPS_TYPE_2      = 2,
-    DPS_TYPE_3      = 3,
-    DPS_TYPE_MASK   = 0x7,
-    DPS_FIRST_EIGHT = 8,
+    NVME_ID_NS_DPS_TYPE_NONE   = 0,
+    NVME_ID_NS_DPS_TYPE_1      = 1,
+    NVME_ID_NS_DPS_TYPE_2      = 2,
+    NVME_ID_NS_DPS_TYPE_3      = 3,
+    NVME_ID_NS_DPS_TYPE_MASK   = 0x7,
+    NVME_ID_NS_DPS_FIRST_EIGHT = 8,
 };

+#define NVME_ID_NS_DPS_TYPE(dps) (dps & NVME_ID_NS_DPS_TYPE_MASK)
+
+typedef struct NvmeDifTuple {
+    uint16_t guard;
+    uint16_t apptag;
+    uint32_t reftag;
+} NvmeDifTuple;
+
 static inline void _nvme_check_size(void)
 {
     QEMU_BUILD_BUG_ON(sizeof(NvmeBar) != 4096);
@@ -1080,5 +1091,6 @@ static inline void _nvme_check_size(void)
     QEMU_BUILD_BUG_ON(sizeof(NvmeIdNs) != 4096);
     QEMU_BUILD_BUG_ON(sizeof(NvmeSglDescriptor) != 16);
     QEMU_BUILD_BUG_ON(sizeof(NvmeIdNsDescr) != 4);
+    QEMU_BUILD_BUG_ON(sizeof(NvmeDifTuple) != 8);
 }
 #endif
diff --git a/hw/block/nvme-ns.c b/hw/block/nvme-ns.c
index a9785a12eb13..0e519d42272c 100644
--- a/hw/block/nvme-ns.c
+++ b/hw/block/nvme-ns.c
@@ -25,11 +25,44 @@
 #include "hw/qdev-properties.h"
 #include "hw/qdev-core.h"

+#include "trace.h"
+
 #include "nvme.h"
 #include "nvme-ns.h"

 #define MIN_DISCARD_GRANULARITY (4 * KiB)

+static int nvme_ns_init_pi(NvmeNamespace *ns, Error **errp)
+{
+    int nlbas = nvme_ns_nlbas(ns);
+    uint16_t pil = ns->id_ns.dps & NVME_ID_NS_DPS_FIRST_EIGHT ?
+ 0 : nvme_ns_ms(ns) - sizeof(NvmeDifTuple); + int64_t offset =3D 1 << nvme_ns_lbads(ns), stride =3D nvme_l2b(ns, 1); + int i, ret; + + NvmeDifTuple dif =3D { + .apptag =3D 0xffff, + .reftag =3D 0xffffffff, + }; + + for (i =3D 0; i < nlbas; i++) { + if (i && i % 0x1000 =3D=3D 0) { + trace_pci_nvme_ns_init_pi(i, nlbas); + } + + ret =3D blk_pwrite(ns->blkconf.blk, i * stride + offset + pil, &di= f, sizeof(dif), + 0); + if (ret < 0) { + error_setg_errno(errp, -ret, "could not write"); + return -1; + } + } + + trace_pci_nvme_ns_init_pi(nlbas, nlbas); + + return 0; +} + static int nvme_ns_init(NvmeNamespace *ns, Error **errp) { BlockDriverInfo bdi; @@ -54,6 +87,15 @@ static int nvme_ns_init(NvmeNamespace *ns, Error **errp) } else { id_ns->mc =3D 0x1; id_ns->flbas |=3D 0x10; + + id_ns->dpc =3D 0x1f; + id_ns->dps =3D (ns->params.pil << 3) | ns->params.pi; + + if (ns->params.pi_init) { + if (nvme_ns_init_pi(ns, errp)) { + return -1; + } + } } =20 id_ns->nsze =3D cpu_to_le64(nvme_ns_nlbas(ns)); @@ -163,6 +205,9 @@ static Property nvme_ns_props[] =3D { DEFINE_BLOCK_PROPERTIES(NvmeNamespace, blkconf), DEFINE_PROP_UINT32("nsid", NvmeNamespace, params.nsid, 0), DEFINE_PROP_UINT16("ms", NvmeNamespace, params.ms, 0), + DEFINE_PROP_UINT8("pi", NvmeNamespace, params.pi, 0), + DEFINE_PROP_UINT8("pil", NvmeNamespace, params.pil, 0), + DEFINE_PROP_BOOL("pi_init", NvmeNamespace, params.pi_init, false), DEFINE_PROP_END_OF_LIST(), }; =20 diff --git a/hw/block/nvme.c b/hw/block/nvme.c index 8d580c121bcc..c60d24704b96 100644 --- a/hw/block/nvme.c +++ b/hw/block/nvme.c @@ -158,6 +158,22 @@ static int nvme_addr_read(NvmeCtrl *n, hwaddr addr, vo= id *buf, int size) return pci_dma_read(&n->parent_obj, addr, buf, size); } =20 +static int nvme_addr_write(NvmeCtrl *n, hwaddr addr, void *buf, int size) +{ + hwaddr hi =3D addr + size - 1; + if (hi < addr) { + return 1; + } + + if (n->bar.cmbsz && nvme_addr_is_cmb(n, addr) && nvme_addr_is_cmb(n, h= i)) { + memcpy(nvme_addr_to_cmb(n, addr), buf, 
size); + return 0; + } + + return pci_dma_write(&n->parent_obj, addr, buf, size); +} + + static bool nvme_nsid_valid(NvmeCtrl *n, uint32_t nsid) { return nsid && (nsid =3D=3D NVME_NSID_BROADCAST || nsid <=3D n->num_na= mespaces); @@ -725,6 +741,60 @@ static uint16_t nvme_tx_iov(uint8_t *ptr, uint32_t len= , QEMUIOVector *iov, return NVME_SUCCESS; } =20 +static uint16_t nvme_tx_interleaved(NvmeCtrl *n, uint8_t *ptr, uint32_t le= n, + uint32_t bytes, uint16_t skip_bytes, + NvmeTxDirection dir, NvmeRequest *req) +{ + hwaddr addr; + int i =3D 0; + int64_t offset =3D 0; + uint32_t trans_len, count =3D bytes; + + /* assert that exactly one of qsg and iov carries data */ + assert((req->qsg.nsg > 0) !=3D (req->iov.niov > 0)); + + while (len) { + trans_len =3D MIN(len, count); + + if (req->qsg.nsg > 0) { + trans_len =3D MIN(trans_len, req->qsg.sg[i].len - offset); + addr =3D req->qsg.sg[i].base + offset; + } else { + trans_len =3D MIN(trans_len, req->iov.iov[i].iov_len - offset); + addr =3D (hwaddr)req->iov.iov[i].iov_base + offset; + } + + if (dir =3D=3D NVME_TX_DIRECTION_TO_DEVICE) { + if (nvme_addr_read(n, addr, ptr, trans_len)) { + return NVME_DATA_TRAS_ERROR; + } + } else { + if (nvme_addr_write(n, addr, ptr, trans_len)) { + return NVME_DATA_TRAS_ERROR; + } + } + + ptr +=3D trans_len; + len -=3D trans_len; + count -=3D trans_len; + offset +=3D trans_len; + + if (count =3D=3D 0) { + count =3D bytes; + ptr +=3D skip_bytes; + len -=3D skip_bytes; + } + + if ((req->qsg.nsg > 0 && offset =3D=3D req->qsg.sg[i].len) || + (req->iov.niov > 0 && offset =3D=3D req->iov.iov[i].iov_len)) { + offset =3D 0; + i++; + } + } + + return NVME_SUCCESS; +} + static uint16_t nvme_tx(NvmeCtrl *n, uint8_t *ptr, uint32_t len, NvmeTxDirection dir, NvmeRequest *req) { @@ -961,6 +1031,143 @@ static uint16_t nvme_check_dulbe(NvmeNamespace *ns, = uint64_t slba, return NVME_SUCCESS; } =20 +static uint16_t nvme_check_prinfo(NvmeNamespace *ns, uint16_t ctrl, + uint64_t slba, uint32_t reftag) +{ + if 
((NVME_ID_NS_DPS_TYPE(ns->id_ns.dps) =3D=3D NVME_ID_NS_DPS_TYPE_1) = && + (slba & 0xffffffff) !=3D reftag) { + return NVME_INVALID_PROT_INFO | NVME_DNR; + } + + return NVME_SUCCESS; +} + +/* from Linux kernel (crypto/crct10dif_common.c) */ +static uint16_t crc_t10dif(const unsigned char *buffer, size_t len) +{ + uint16_t crc =3D 0; + unsigned int i; + + for (i =3D 0; i < len; i++) { + crc =3D (crc << 8) ^ t10_dif_crc_table[((crc >> 8) ^ buffer[i]) & = 0xff]; + } + + return crc; +} + +static void nvme_e2e_pract_generate_dif(NvmeNamespace *ns, uint8_t *buf, + size_t len, uint16_t apptag, + uint32_t reftag) +{ + uint8_t *end =3D buf + len; + size_t lba_size =3D nvme_l2b(ns, 1); + size_t chksum_len =3D 1 << nvme_ns_lbads(ns); + + if (!(ns->id_ns.dps & NVME_ID_NS_DPS_FIRST_EIGHT)) { + chksum_len +=3D nvme_ns_ms(ns) - sizeof(NvmeDifTuple); + } + + trace_pci_nvme_e2e_pract_generate_dif(len, lba_size, chksum_len, appta= g, + reftag); + + for (; buf < end; buf +=3D lba_size) { + NvmeDifTuple *dif =3D (NvmeDifTuple *)(buf + chksum_len); + + dif->guard =3D cpu_to_be16(crc_t10dif(buf, chksum_len)); + dif->apptag =3D cpu_to_be16(apptag); + dif->reftag =3D cpu_to_be32(reftag); + + if (NVME_ID_NS_DPS_TYPE(ns->id_ns.dps) !=3D NVME_ID_NS_DPS_TYPE_3)= { + reftag++; + } + } +} + +static uint16_t nvme_e2e_prchk(NvmeNamespace *ns, NvmeDifTuple *dif, + uint8_t *buf, size_t len, uint16_t ctrl, + uint16_t apptag, uint16_t appmask, + uint32_t reftag) +{ + switch (NVME_ID_NS_DPS_TYPE(ns->id_ns.dps)) { + case NVME_ID_NS_DPS_TYPE_3: + if (be32_to_cpu(dif->reftag) !=3D 0xffffffff) { + break; + } + + /* fallthrough */ + case NVME_ID_NS_DPS_TYPE_1: + case NVME_ID_NS_DPS_TYPE_2: + if (be16_to_cpu(dif->apptag) !=3D 0xffff) { + break; + } + + trace_pci_nvme_e2e_prchk_disabled(be16_to_cpu(dif->apptag), + be32_to_cpu(dif->reftag)); + + return NVME_SUCCESS; + } + + if (ctrl & NVME_RW_PRINFO_PRCHK_GUARD) { + uint16_t crc =3D crc_t10dif(buf, len); + 
trace_pci_nvme_e2e_prchk_guard(be16_to_cpu(dif->guard), crc); + + if (be16_to_cpu(dif->guard) !=3D crc) { + return NVME_E2E_GUARD_ERROR; + } + } + + if (ctrl & NVME_RW_PRINFO_PRCHK_APP) { + trace_pci_nvme_e2e_prchk_apptag(be16_to_cpu(dif->apptag), apptag, + appmask); + + if ((be16_to_cpu(dif->apptag) & appmask) !=3D (apptag & appmask)) { + return NVME_E2E_APP_ERROR; + } + } + + if (ctrl & NVME_RW_PRINFO_PRCHK_REF) { + trace_pci_nvme_e2e_prchk_reftag(be32_to_cpu(dif->reftag), reftag); + + if (be32_to_cpu(dif->reftag) !=3D reftag) { + return NVME_E2E_REF_ERROR; + } + } + + return NVME_SUCCESS; +} + +static uint16_t nvme_e2e_check(NvmeNamespace *ns, uint8_t *buf, size_t len, + uint16_t ctrl, uint16_t apptag, + uint16_t appmask, uint32_t reftag) +{ + uint8_t *end =3D buf + len; + size_t lba_size =3D nvme_l2b(ns, 1); + size_t chksum_len =3D 1 << nvme_ns_lbads(ns); + uint16_t status; + + if (!(ns->id_ns.dps & NVME_ID_NS_DPS_FIRST_EIGHT)) { + chksum_len +=3D nvme_ns_ms(ns) - sizeof(NvmeDifTuple); + } + + trace_pci_nvme_e2e_check(NVME_RW_PRINFO(ctrl), chksum_len); + + for (; buf < end; buf +=3D lba_size) { + NvmeDifTuple *dif =3D (NvmeDifTuple *)(buf + chksum_len); + + status =3D nvme_e2e_prchk(ns, dif, buf, chksum_len, ctrl, apptag, + appmask, reftag); + if (status) { + return status; + } + + if (NVME_ID_NS_DPS_TYPE(ns->id_ns.dps) !=3D NVME_ID_NS_DPS_TYPE_3)= { + reftag++; + } + } + + return NVME_SUCCESS; +} + static void nvme_aio_err(NvmeRequest *req, int ret) { uint16_t status =3D NVME_SUCCESS; @@ -980,7 +1187,7 @@ static void nvme_aio_err(NvmeRequest *req, int ret) break; } =20 - trace_pci_nvme_err_aio(nvme_cid(req), strerror(ret), status); + trace_pci_nvme_err_aio(nvme_cid(req), strerror(-ret), status); =20 error_setg_errno(&local_err, -ret, "aio failed"); error_report_err(local_err); @@ -1017,6 +1224,73 @@ static void nvme_rw_cb(void *opaque, int ret) nvme_enqueue_req_completion(nvme_cq(req), req); } =20 +struct nvme_e2e_ctx { + NvmeRequest *req; + QEMUIOVector iov; 
+ uint8_t *bounce; +}; + +static void nvme_e2e_rw_cb(void *opaque, int ret) +{ + struct nvme_e2e_ctx *ctx =3D opaque; + NvmeRequest *req =3D ctx->req; + + trace_pci_nvme_e2e_rw_cb(nvme_cid(req)); + + qemu_iovec_destroy(&ctx->iov); + g_free(ctx->bounce); + g_free(ctx); + + nvme_rw_cb(req, ret); +} + +static void nvme_e2e_rw_check_cb(void *opaque, int ret) +{ + struct nvme_e2e_ctx *ctx =3D opaque; + NvmeRequest *req =3D ctx->req; + NvmeNamespace *ns =3D req->ns; + NvmeCtrl *n =3D nvme_ctrl(req); + NvmeRwCmd *rw =3D (NvmeRwCmd *)&req->cmd; + uint32_t nlb =3D le16_to_cpu(rw->nlb) + 1; + uint16_t ctrl =3D le16_to_cpu(rw->control); + uint16_t apptag =3D le16_to_cpu(rw->apptag); + uint16_t appmask =3D le16_to_cpu(rw->appmask); + uint32_t reftag =3D le32_to_cpu(rw->reftag); + uint16_t status; + + trace_pci_nvme_e2e_rw_check_cb(nvme_cid(req), NVME_RW_PRINFO(ctrl), ap= ptag, + appmask, reftag); + + if (ret) { + goto out; + } + + status =3D nvme_e2e_check(ns, ctx->bounce, ctx->iov.size, ctrl, apptag, + appmask, reftag); + if (status) { + req->status =3D status; + goto out; + } + + if (ctrl & NVME_RW_PRINFO_PRACT && nvme_ns_ms(ns) =3D=3D 8) { + size_t lba_size =3D 1 << nvme_ns_lbads(ns); + + status =3D nvme_tx_interleaved(n, ctx->bounce, nvme_l2b(ns, nlb), + lba_size, sizeof(NvmeDifTuple), + NVME_TX_DIRECTION_FROM_DEVICE, req); + } else { + status =3D nvme_tx(n, ctx->bounce, nvme_l2b(ns, nlb), + NVME_TX_DIRECTION_FROM_DEVICE, req); + } + + if (status) { + req->status =3D status; + } + +out: + nvme_e2e_rw_cb(ctx, ret); +} + static void nvme_aio_discard_cb(void *opaque, int ret) { NvmeRequest *req =3D opaque; @@ -1047,6 +1321,12 @@ static void nvme_compare_cb(void *opaque, int ret) { NvmeRequest *req =3D opaque; NvmeNamespace *ns =3D req->ns; + NvmeRwCmd *rw =3D (NvmeRwCmd *)&req->cmd; + uint32_t nlb =3D le16_to_cpu(rw->nlb) + 1; + uint16_t ctrl =3D le16_to_cpu(rw->control); + uint16_t apptag =3D le16_to_cpu(rw->apptag); + uint16_t appmask =3D le16_to_cpu(rw->appmask); + 
uint32_t reftag =3D le32_to_cpu(rw->reftag); struct nvme_compare_ctx *ctx =3D req->opaque; g_autofree uint8_t *buf =3D NULL; uint16_t status; @@ -1061,6 +1341,15 @@ static void nvme_compare_cb(void *opaque, int ret) goto out; } =20 + if (NVME_ID_NS_DPS_TYPE(ns->id_ns.dps)) { + status =3D nvme_e2e_check(ns, ctx->bounce, ctx->iov.size, ctrl, + apptag, appmask, reftag); + if (status) { + req->status =3D status; + goto out; + } + } + buf =3D g_malloc(ctx->len); =20 status =3D nvme_h2c(nvme_ctrl(req), buf, ctx->len, req); @@ -1069,6 +1358,50 @@ static void nvme_compare_cb(void *opaque, int ret) goto out; } =20 + if (NVME_ID_NS_DPS_TYPE(ns->id_ns.dps)) { + size_t stride =3D nvme_l2b(ns, 1); + uint16_t ms =3D nvme_ns_ms(ns); + uint64_t pos =3D 0; + bool first =3D !!(ns->id_ns.dps & NVME_ID_NS_DPS_FIRST_EIGHT); + size_t cmp_len; + + status =3D nvme_e2e_check(ns, buf, nlb, ctrl, apptag, appmask, ref= tag); + if (status) { + req->status =3D status; + goto out; + } + + /* + * When formatted with protection information, do not compare the = DIF + * tuple. 
+ */ + cmp_len =3D 1 << nvme_ns_lbads(ns); + if (!first) { + cmp_len +=3D ms - sizeof(NvmeDifTuple); + } + + for (int i =3D 0; i < nlb; i++) { + if (memcmp(buf + pos, ctx->bounce + pos, cmp_len)) { + req->status =3D NVME_CMP_FAILURE; + break; + } + + if (!first) { + pos +=3D stride; + continue; + } + + pos +=3D cmp_len + sizeof(NvmeDifTuple); + if (memcmp(buf + pos, ctx->bounce + pos, + ms - sizeof(NvmeDifTuple))) { + req->status =3D NVME_CMP_FAILURE; + break; + } + } + + goto out; + } + if (memcmp(buf, ctx->bounce, ctx->len)) { req->status =3D NVME_CMP_FAILURE; } @@ -1162,12 +1495,24 @@ static uint16_t nvme_compare(NvmeCtrl *n, NvmeReque= st *req) uint32_t nlb =3D le16_to_cpu(rw->nlb) + 1; size_t len =3D nvme_l2b(ns, nlb); int64_t offset =3D nvme_l2b(ns, slba); + uint16_t ctrl =3D le16_to_cpu(rw->control); + uint32_t reftag =3D le32_to_cpu(rw->reftag); uint8_t *bounce =3D NULL; struct nvme_compare_ctx *ctx =3D NULL; uint16_t status; =20 trace_pci_nvme_compare(nvme_cid(req), nvme_nsid(ns), slba, nlb); =20 + status =3D nvme_check_prinfo(ns, ctrl, slba, reftag); + if (status) { + return status; + } + + if (NVME_ID_NS_DPS_TYPE(ns->id_ns.dps) && + (ctrl & NVME_RW_PRINFO_PRACT)) { + return NVME_INVALID_PROT_INFO | NVME_DNR; + } + status =3D nvme_check_mdts(n, len); if (status) { trace_pci_nvme_err_mdts(nvme_cid(req), len); @@ -1220,10 +1565,23 @@ static uint16_t nvme_write_zeroes(NvmeCtrl *n, Nvme= Request *req) uint32_t nlb =3D (uint32_t)le16_to_cpu(rw->nlb) + 1; uint64_t offset =3D nvme_l2b(ns, slba); uint32_t count =3D nvme_l2b(ns, nlb); + uint16_t ctrl =3D le16_to_cpu(rw->control); + uint16_t apptag =3D le16_to_cpu(rw->apptag); + uint32_t reftag =3D le32_to_cpu(rw->reftag); + struct nvme_e2e_ctx *ctx; uint16_t status; =20 trace_pci_nvme_write_zeroes(nvme_cid(req), nvme_nsid(ns), slba, nlb); =20 + if (ctrl & NVME_RW_PRINFO_PRCHK_MASK) { + return NVME_INVALID_PROT_INFO | NVME_DNR; + } + + status =3D nvme_check_prinfo(ns, ctrl, slba, reftag); + if (status) { + return 
status; + } + status =3D nvme_check_bounds(ns, slba, nlb); if (status) { trace_pci_nvme_err_invalid_lba_range(slba, nlb, ns->id_ns.nsze); @@ -1232,19 +1590,118 @@ static uint16_t nvme_write_zeroes(NvmeCtrl *n, Nvm= eRequest *req) =20 block_acct_start(blk_get_stats(req->ns->blkconf.blk), &req->acct, 0, BLOCK_ACCT_WRITE); + + if (NVME_ID_NS_DPS_TYPE(ns->id_ns.dps) && (ctrl & NVME_RW_PRINFO_PRACT= )) { + ctx =3D g_new(struct nvme_e2e_ctx, 1); + ctx->req =3D req; + ctx->bounce =3D g_malloc0(count); + + qemu_iovec_init(&ctx->iov, 1); + qemu_iovec_add(&ctx->iov, ctx->bounce, count); + + /* splice generated protection information into the buffer */ + nvme_e2e_pract_generate_dif(ns, ctx->bounce, count, apptag, reftag= ); + + req->aiocb =3D blk_aio_pwritev(ns->blkconf.blk, offset, &ctx->iov,= 0, + nvme_e2e_rw_cb, ctx); + + return NVME_NO_COMPLETE; + } + req->aiocb =3D blk_aio_pwrite_zeroes(req->ns->blkconf.blk, offset, cou= nt, BDRV_REQ_MAY_UNMAP, nvme_rw_cb, req= ); return NVME_NO_COMPLETE; } =20 +static uint16_t nvme_e2e_rw(NvmeCtrl *n, NvmeRequest *req) +{ + NvmeRwCmd *rw =3D (NvmeRwCmd *)&req->cmd; + NvmeNamespace *ns =3D req->ns; + uint32_t nlb =3D (uint32_t)le16_to_cpu(rw->nlb) + 1; + uint64_t slba =3D le64_to_cpu(rw->slba); + size_t len =3D nvme_l2b(ns, nlb); + size_t offset =3D nvme_l2b(ns, slba); + uint16_t ctrl =3D le16_to_cpu(rw->control); + uint16_t apptag =3D le16_to_cpu(rw->apptag); + uint16_t appmask =3D le16_to_cpu(rw->appmask); + uint32_t reftag =3D le32_to_cpu(rw->reftag); + struct nvme_e2e_ctx *ctx; + uint16_t status; + + trace_pci_nvme_e2e_rw(!!(ctrl & NVME_RW_PRINFO_PRACT)); + + status =3D nvme_check_prinfo(ns, ctrl, slba, reftag); + if (status) { + return status; + } + + ctx =3D g_new(struct nvme_e2e_ctx, 1); + ctx->req =3D req; + ctx->bounce =3D g_malloc(len); + + qemu_iovec_init(&ctx->iov, 1); + qemu_iovec_add(&ctx->iov, ctx->bounce, len); + + if (req->cmd.opcode =3D=3D NVME_CMD_READ) { + req->aiocb =3D blk_aio_preadv(ns->blkconf.blk, offset, 
&ctx->iov, = 0, + nvme_e2e_rw_check_cb, ctx); + return NVME_NO_COMPLETE; + } + + if (ctrl & NVME_RW_PRINFO_PRACT && nvme_ns_ms(ns) =3D=3D 8) { + size_t lba_size =3D 1 << nvme_ns_lbads(ns); + + /* + * For writes, transfer logical block data interleaved into a meta= data + * extended buffer and splice the generated protection information= into + * it afterwards. + */ + status =3D nvme_tx_interleaved(n, ctx->bounce, len, lba_size, 8, + NVME_TX_DIRECTION_TO_DEVICE, req); + if (status) { + goto err; + } + } else { + status =3D nvme_tx(n, ctx->bounce, len, NVME_TX_DIRECTION_TO_DEVIC= E, + req); + if (status) { + goto err; + } + } + + if (ctrl & NVME_RW_PRINFO_PRACT) { + /* splice generated protection information into the buffer */ + nvme_e2e_pract_generate_dif(ns, ctx->bounce, len, apptag, reftag); + } else { + status =3D nvme_e2e_check(ns, ctx->bounce, len, ctrl, apptag, appm= ask, + reftag); + if (status) { + goto err; + } + } + + req->aiocb =3D blk_aio_pwritev(ns->blkconf.blk, offset, &ctx->iov, 0, + nvme_e2e_rw_cb, ctx); + + return NVME_NO_COMPLETE; + +err: + g_free(ctx->bounce); + g_free(ctx); + + return status; +} + static uint16_t nvme_rw(NvmeCtrl *n, NvmeRequest *req) { NvmeRwCmd *rw =3D (NvmeRwCmd *)&req->cmd; NvmeNamespace *ns =3D req->ns; uint32_t nlb =3D (uint32_t)le16_to_cpu(rw->nlb) + 1; uint64_t slba =3D le64_to_cpu(rw->slba); + uint16_t ctrl =3D le16_to_cpu(rw->control); =20 uint64_t data_size =3D nvme_l2b(ns, nlb); + uint64_t real_data_size =3D data_size; uint64_t data_offset =3D nvme_l2b(ns, slba); enum BlockAcctType acct =3D req->cmd.opcode =3D=3D NVME_CMD_WRITE ? BLOCK_ACCT_WRITE : BLOCK_ACCT_READ; @@ -1252,6 +1709,17 @@ static uint16_t nvme_rw(NvmeCtrl *n, NvmeRequest *re= q) uint16_t status; uint32_t sector_size; =20 + /* + * If the namespace is formatted with protecting information, the numb= er of + * metadata bytes is exactly 8 and the PRACT field is set, then the + * metadata is not resident in the host buffer. 
+ */ + if (NVME_ID_NS_DPS_TYPE(ns->id_ns.dps) && + (nvme_ns_ms(ns) =3D=3D sizeof(NvmeDifTuple)) && + (ctrl & NVME_RW_PRINFO_PRACT)) { + data_size -=3D nlb << 3; + } + trace_pci_nvme_rw(nvme_cid(req), nvme_io_opc_str(rw->opcode), nvme_nsid(ns), nlb, data_size, slba); =20 @@ -1281,7 +1749,12 @@ static uint16_t nvme_rw(NvmeCtrl *n, NvmeRequest *re= q) goto invalid; } =20 - block_acct_start(blk_get_stats(blk), &req->acct, data_size, acct); + block_acct_start(blk_get_stats(blk), &req->acct, real_data_size, acct); + + if (NVME_ID_NS_DPS_TYPE(ns->id_ns.dps)) { + return nvme_e2e_rw(n, req); + } + if (req->qsg.sg) { sector_size =3D nvme_l2b(ns, 1); if (acct =3D=3D BLOCK_ACCT_WRITE) { diff --git a/hw/block/trace-events b/hw/block/trace-events index 68a4c8ed35e0..0ae5676cc28a 100644 --- a/hw/block/trace-events +++ b/hw/block/trace-events @@ -30,6 +30,7 @@ hd_geometry_guess(void *blk, uint32_t cyls, uint32_t head= s, uint32_t secs, int t # nvme.c # nvme traces for successful events pci_nvme_register_namespace(uint32_t nsid) "nsid %"PRIu32"" +pci_nvme_ns_init_pi(int blocks, int total) "blocks %d/%d" pci_nvme_irq_msix(uint32_t vector) "raising MSI-X IRQ vector %u" pci_nvme_irq_pin(void) "pulsing IRQ pin" pci_nvme_irq_masked(void) "IRQ is masked" @@ -42,6 +43,15 @@ pci_nvme_io_cmd(uint16_t cid, uint32_t nsid, uint16_t sq= id, uint8_t opcode, cons pci_nvme_admin_cmd(uint16_t cid, uint16_t sqid, uint8_t opcode, const char= *opname) "cid %"PRIu16" sqid %"PRIu16" opc 0x%"PRIx8" opname '%s'" pci_nvme_rw(uint16_t cid, const char *verb, uint32_t nsid, uint32_t nlb, u= int64_t count, uint64_t lba) "cid %"PRIu16" opname '%s' nsid %"PRIu32" nlb = %"PRIu32" count %"PRIu64" lba 0x%"PRIx64"" pci_nvme_rw_cb(uint16_t cid, const char *blkname) "cid %"PRIu16" blk '%s'" +pci_nvme_e2e_rw(uint8_t pract) "pract 0x%"PRIx8"" +pci_nvme_e2e_rw_cb(uint16_t cid) "cid %"PRIu16"" +pci_nvme_e2e_rw_check_cb(uint16_t cid, uint8_t prinfo, uint16_t apptag, ui= nt16_t appmask, uint32_t reftag) "cid %"PRIu16" 
prinfo 0x%"PRIx8" apptag 0x= %"PRIx16" appmask 0x%"PRIx16" reftag 0x%"PRIx32"" +pci_nvme_e2e_pract_generate_dif(size_t len, size_t lba_size, size_t chksum= _len, uint16_t apptag, uint32_t reftag) "len %zu lba_size %zu chksum_len %z= u apptag 0x%"PRIx16" reftag 0x%"PRIx32"" +pci_nvme_e2e_check(uint8_t prinfo, uint16_t chksum_len) "prinfo 0x%"PRIx8"= chksum_len %"PRIu16"" +pci_nvme_e2e_prchk_disabled(uint16_t apptag, uint32_t reftag) "apptag 0x%"= PRIx16" reftag 0x%"PRIx32"" +pci_nvme_e2e_prchk_guard(uint16_t guard, uint16_t crc) "guard 0x%"PRIx16" = crc 0x%"PRIx16"" +pci_nvme_e2e_prchk_apptag(uint16_t apptag, uint16_t elbat, uint16_t elbatm= ) "apptag 0x%"PRIx16" elbat 0x%"PRIx16" elbatm 0x%"PRIx16"" +pci_nvme_e2e_prchk_reftag(uint32_t reftag, uint32_t elbrt) "reftag 0x%"PRI= x32" elbrt 0x%"PRIx32"" pci_nvme_write_zeroes(uint16_t cid, uint32_t nsid, uint64_t slba, uint32_t= nlb) "cid %"PRIu16" nsid %"PRIu32" slba %"PRIu64" nlb %"PRIu32"" pci_nvme_block_status(int64_t offset, int64_t bytes, int64_t pnum, int ret= , bool zeroed) "offset %"PRId64" bytes %"PRId64" pnum %"PRId64" ret 0x%x ze= roed %d" pci_nvme_dsm(uint16_t cid, uint32_t nsid, uint32_t nr, uint32_t attr) "cid= %"PRIu16" nsid %"PRIu32" nr %"PRIu32" attr 0x%"PRIx32"" --=20 2.29.2